ABSTRACT
Finding relevant literature is crucial for biomedical research and for the practice of evidence-based medicine, making biomedical search an important application area within the field of information retrieval (IR). This has been recognised by the broader IR community, and in particular by the organisers of the Text REtrieval Conference (TREC), since as early as 2003. While TREC provides crucial evaluation resources, getting started in biomedical IR means first clearing a substantial software engineering hurdle: parsing, indexing, and deploying several large document collections. Moreover, newcomers to the field often face a steep learning curve, where theoretical concepts are tangled up with technical aspects. Finally, many of the existing baselines and systems are difficult to reproduce.
We aim to alleviate all three of these bottlenecks with the launch of A2A-API, a RESTful API that serves as an easy-to-use, programming-language-independent interface to existing biomedical TREC collections. It builds upon A2A, our system for biomedical information retrieval benchmarking, and extends it with additional functionality. Apart from providing programmatic access to the features of the original A2A system, which focused principally on benchmarking, A2A-API supports biomedical IR researchers in the development of systems featuring reranking and query reformulation components. In this demonstration, we illustrate the capabilities of A2A-API with comprehensive use cases.
Index Terms
- A2A-API: A Prototype for Biomedical Information Retrieval Research and Benchmarking