DOI: 10.1145/3477495.3531667
short-paper

A2A-API: A Prototype for Biomedical Information Retrieval Research and Benchmarking

Published: 07 July 2022

ABSTRACT

Finding relevant literature is crucial for biomedical research and for the practice of evidence-based medicine, making biomedical search an important application area within the field of information retrieval (IR). This has been recognised by the broader IR community, and in particular by the organisers of the Text REtrieval Conference (TREC), since as early as 2003. While TREC provides crucial evaluation resources, getting started in biomedical IR requires clearing a significant software engineering hurdle: parsing, indexing, and deploying several large document collections. Moreover, newcomers to the field often face a steep learning curve, where theoretical concepts are entangled with technical aspects. Finally, many of the existing baselines and systems are difficult to reproduce.

We aim to alleviate all three of these bottlenecks with the launch of A2A-API, a RESTful API that serves as an easy-to-use, programming-language-independent interface to existing biomedical TREC collections. It builds upon A2A, our system for biomedical information retrieval benchmarking, and extends it with additional functionality. Apart from providing programmatic access to the features of the original A2A system, which focused principally on benchmarking, A2A-API supports biomedical IR researchers in the development of systems featuring reranking and query reformulation components. In this demonstration, we illustrate the capabilities of A2A-API with comprehensive use cases.

Published in

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2022
3569 pages
ISBN: 9781450387323
DOI: 10.1145/3477495

Copyright © 2022 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%