DOI: 10.1145/1882992.1882999

Modeling annotation time to reduce workload in comparative effectiveness reviews

Published: 11 November 2010

ABSTRACT

Comparative effectiveness reviews (CERs), a central methodology of comparative effectiveness research, are increasingly used to inform healthcare decisions. During these systematic reviews of the scientific literature, the reviewers (MD-methodologists) must screen several thousand citations for eligibility according to a pre-specified protocol. While previous research has demonstrated the theoretical potential of machine learning to reduce the workload in CERs, practical obstacles to deploying such a system remain. In this article, we describe work on an end-to-end, interactive machine learning system for assisting reviewers with the tedious task of citation screening for CERs. Specifically, we present ABSTRACKR, our open-source annotation tool. In addition to allowing reviewers to designate citations as 'relevant' or 'irrelevant' to the review at hand, ABSTRACKR lets reviewers communicate other information useful to the classification model, such as terms that are suggestive of the relevance (or irrelevance) of a citation. The tool also records the time taken to screen each citation; we conducted a time-series analysis of these timing data to derive an annotator model. Using this model, we found that both the order in which the citations are screened and the length of each citation affect annotation time. We propose a strategy that integrates labeled terms and timing data into the Active Learning (AL) framework, in which an algorithm selects citations for the reviewer to label. We demonstrate empirically that this additional information can improve the performance of the semi-automated citation screening system.
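To make the idea concrete, the following is a minimal, self-contained Python sketch of one way a cost-sensitive active-learning loop of this kind could be wired together. It is not the authors' implementation: the bag-of-words features, the linear annotation-time model, the term-bonus weight, and every function name below are illustrative assumptions layered on a generic scikit-learn classifier.

```python
# Illustrative sketch (not the paper's implementation): cost-sensitive active
# learning for citation screening. Annotation time is modeled as a linear
# function of citation length and screening order, and the next citation to
# screen is the one with the best uncertainty-per-predicted-second score.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LinearRegression, LogisticRegression


def fit_time_model(lengths, orders, seconds):
    """Regress observed screening time on citation length and screening order."""
    X = np.column_stack([lengths, orders])
    return LinearRegression().fit(X, seconds)


def select_next(clf, vec, unlabeled_texts, time_model, next_order, labeled_terms):
    """Pick the unlabeled citation with the highest 'informativeness per
    predicted annotation second' (an illustrative acquisition score)."""
    X = vec.transform(unlabeled_texts)
    probs = clf.predict_proba(X)[:, 1]
    uncertainty = 1.0 - np.abs(probs - 0.5) * 2.0        # 1 at p=0.5, 0 at p=0 or 1

    lengths = np.array([len(t.split()) for t in unlabeled_texts])
    orders = np.full(len(unlabeled_texts), next_order)
    pred_seconds = np.clip(
        time_model.predict(np.column_stack([lengths, orders])), 1.0, None)

    # Reviewer-labeled terms nudge the score of citations containing them.
    bonus = np.array([sum(term in t.lower() for term in labeled_terms)
                      for t in unlabeled_texts])
    score = (uncertainty + 0.1 * bonus) / pred_seconds
    return int(np.argmax(score))


# Toy usage: a handful of already-screened citations and their timings.
labeled_texts = ["randomized trial of statins", "case report of a rash",
                 "meta-analysis of beta blockers", "survey of nurse staffing"]
labels = np.array([1, 0, 1, 0])                          # relevant / irrelevant
seconds = np.array([30.0, 12.0, 28.0, 15.0])
orders = np.arange(len(labeled_texts))
lengths = np.array([len(t.split()) for t in labeled_texts])

vec = CountVectorizer().fit(labeled_texts)
clf = LogisticRegression().fit(vec.transform(labeled_texts), labels)
time_model = fit_time_model(lengths, orders, seconds)

unlabeled = ["cohort study of statin adherence", "editorial on hospital parking"]
idx = select_next(clf, vec, unlabeled, time_model,
                  next_order=len(labeled_texts), labeled_terms=["statin", "trial"])
print("screen next:", unlabeled[idx])
```

In this toy setup the acquisition score divides classifier uncertainty by the predicted screening time, so a long abstract must be proportionally more informative to be worth the reviewer's attention; the paper's actual annotator model and AL strategy, and their empirical evaluation, are described in the full text.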


Published in

IHI '10: Proceedings of the 1st ACM International Health Informatics Symposium
November 2010, 886 pages
ISBN: 9781450300308
DOI: 10.1145/1882992

      Copyright © 2010 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States
