Research article
DOI: 10.1145/3269206.3271796

Effective User Interaction for High-Recall Retrieval: Less is More

Published: 17 October 2018

ABSTRACT

High-recall retrieval --- finding all or nearly all relevant documents --- is critical to applications such as electronic discovery, systematic review, and the construction of test collections for information retrieval tasks. The effectiveness of current methods for high-recall information retrieval is limited by their reliance on human input, either to generate queries or to assess the relevance of documents. Past research has shown that humans can assess the relevance of documents faster, and with little loss in accuracy, by judging shorter document surrogates, e.g., extractive summaries, in place of full documents. To test the hypothesis that short document surrogates can reduce assessment time and effort for high-recall retrieval, we conducted a 50-person controlled user study. We designed a high-recall retrieval system using continuous active learning (CAL) that could display either full documents or short document excerpts for relevance assessment. In addition, we tested the value of integrating a search engine with CAL. In the experiment, we asked participants to try to find as many relevant documents as possible within one hour. We observed that our study participants were able to find significantly more relevant documents when they used the system with document excerpts as opposed to full documents. We also found that allowing participants to compose and execute their own search queries did not improve their ability to find relevant documents and, by some measures, impaired performance. These results suggest that for high-recall systems to maximize performance, system designers should think carefully about the amount and nature of user interaction incorporated into the system.
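The CAL workflow the abstract describes — repeatedly present the highest-scoring unjudged document (or a short excerpt of it) to a human assessor, then update the model with each new judgment — can be sketched as follows. This is a minimal illustration, not the paper's system: the simulated assessor, the toy term-overlap scorer standing in for a trained classifier, and all function names are assumptions for the sake of the example.

```python
# Minimal sketch of a continuous active learning (CAL) loop with excerpt
# display. A term-overlap score stands in for the learned classifier the
# paper uses; names and parameters here are illustrative, not the authors'.

def make_excerpt(doc, terms, width=8):
    """Return a short word window around the first matching term --- the
    surrogate shown to the assessor instead of the full document."""
    words = doc.split()
    for i, w in enumerate(words):
        if w.lower() in terms:
            return " ".join(words[max(0, i - width): i + width])
    return " ".join(words[:2 * width])        # fallback: document opening

def cal_loop(docs, assess, seed_terms, budget):
    """Present the top-ranked unjudged document each round and update the
    (toy) model after every judgment; stop after `budget` assessments."""
    terms = set(seed_terms)                   # stand-in for a classifier
    judged, found = set(), []
    for _ in range(budget):
        # rank unjudged docs by overlap with the current term set
        candidates = [(sum(w.lower() in terms for w in d.split()), i)
                      for i, d in enumerate(docs) if i not in judged]
        if not candidates:
            break
        _, best = max(candidates)             # highest-scoring document
        judged.add(best)
        if assess(make_excerpt(docs[best], terms)):   # human judgment
            found.append(best)
            # "retrain": absorb the relevant document's vocabulary
            terms |= {w.lower() for w in docs[best].split()}
    return found
```

A simulated assessor who judges an excerpt relevant when it mentions "jazz" would drive `cal_loop(docs, lambda ex: "jazz" in ex, {"jazz"}, budget=4)` to both jazz documents in a four-document collection while spending judgments on excerpts only, which is the time saving the study measures.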


Published in
CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
October 2018, 2362 pages
ISBN: 9781450360142
DOI: 10.1145/3269206

      Copyright © 2018 ACM


Publisher: Association for Computing Machinery, New York, NY, United States



Acceptance rates
CIKM '18 paper acceptance rate: 147 of 826 submissions (18%). Overall acceptance rate: 1,861 of 8,427 submissions (22%).
