Investigating the Impact of Query Representation on Medical Information Retrieval

Peikos, Georgios; Alexander, Daria; Pasi, Gabriella; de Vries, Arjen P.

doi:10.1007/978-3-031-28238-6_42

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13981)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

This study investigates the effect that various types of patient-related information extracted from unstructured clinical notes have on two tasks: patient allocation in clinical trials and medical literature retrieval. Specifically, we combine standard and transformer-based methods to extract entities (e.g., drugs, medical problems), disambiguate their meaning (e.g., family history, negations), or expand them with related medical concepts to synthesize diverse query representations. The empirical evaluation showed that certain query representations improve retrieval effectiveness for patient allocation in clinical trials, whereas no statistically significant improvements were identified in medical literature retrieval. Across the queries, removing negated entities with a domain-specific pre-trained transformer model was more effective than removing them with a standard rule-based approach. In addition, our experiments showed that removing information related to family history can further improve patient allocation in clinical trials.
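To make the idea of synthesizing query representations concrete, the following is a minimal sketch and not the authors' implementation. It assumes the entities have already been extracted and labeled as negated or family-history by some upstream tool; the `Entity` class, the `build_query` and `query_variants` helpers, the variant names, and the toy annotations are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """One extracted mention (hypothetical structure, not from the paper)."""
    text: str                     # surface form found in the clinical note
    kind: str                     # e.g. "problem" or "drug" (assumed label set)
    negated: bool = False         # assumed output of a negation detector
    family_history: bool = False  # assumed output of an experiencer detector


def build_query(entities, drop_negated=False, drop_family=False, expansions=None):
    """Join the kept entity texts (plus optional expansions) into a keyword query."""
    expansions = expansions or {}
    terms = []
    for ent in entities:
        if drop_negated and ent.negated:
            continue
        if drop_family and ent.family_history:
            continue
        terms.append(ent.text)
        # Append related concepts if any were supplied for this entity.
        terms.extend(expansions.get(ent.text, []))
    return " ".join(terms)


def query_variants(entities, expansions=None):
    """Build several query representations from the same set of entities."""
    return {
        "all_entities": build_query(entities),
        "no_negated": build_query(entities, drop_negated=True),
        "no_negated_no_family": build_query(
            entities, drop_negated=True, drop_family=True
        ),
        "expanded": build_query(
            entities, drop_negated=True, drop_family=True, expansions=expansions
        ),
    }


# Toy annotations for an invented note:
# "Patient with type 2 diabetes on metformin, denies chest pain;
#  mother had breast cancer."
ENTITIES = [
    Entity("type 2 diabetes", "problem"),
    Entity("metformin", "drug"),
    Entity("chest pain", "problem", negated=True),
    Entity("breast cancer", "problem", family_history=True),
]

if __name__ == "__main__":
    # The expansion dictionary is a hand-written stand-in for a concept lookup.
    for name, query in query_variants(ENTITIES, {"metformin": ["biguanide"]}).items():
        print(f"{name}: {query}")
```

Each variant could then be issued against the same index so that the effect of a single filtering or expansion step on retrieval effectiveness can be compared in isolation.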

Notes

1. All indexing parameter combinations were evaluated; however, these parameters lead to greater retrieval performance.
2. https://github.com/inf_extraction_med_ir.

Acknowledgements

This work was supported by the EU Horizon 2020 ITN/ETN on Domain Specific Systems for Information Extraction and Retrieval (H2020-EU.1.3.1., ID: 860721).

Author information

Authors and Affiliations

University of Milano-Bicocca, Milan, Italy
Georgios Peikos & Gabriella Pasi
Radboud University, Nijmegen, The Netherlands
Daria Alexander & Arjen P. de Vries
Spinque, Utrecht, The Netherlands
Daria Alexander

Corresponding author

Correspondence to Georgios Peikos.

Editor information

Editors and Affiliations

University of Amsterdam, Amsterdam, The Netherlands
Jaap Kamps
Université Grenoble-Alpes, Saint-Martin-d’Hères, France
Lorraine Goeuriot
Università della Svizzera Italiana, Lugano, Switzerland
Fabio Crestani
University of Copenhagen, Copenhagen, Denmark
Maria Maistro
University of Tsukuba, Ibaraki, Japan
Hideo Joho
Dublin City University, Dublin, Ireland
Brian Davis
Dublin City University, Dublin, Ireland
Cathal Gurrin
Universität Regensburg, Regensburg, Germany
Udo Kruschwitz
Dublin City University, Dublin, Ireland
Annalina Caputo

About this paper

Cite this paper

Peikos, G., Alexander, D., Pasi, G., de Vries, A.P. (2023). Investigating the Impact of Query Representation on Medical Information Retrieval. In: Kamps, J., et al. Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science, vol 13981. Springer, Cham. https://doi.org/10.1007/978-3-031-28238-6_42

DOI: https://doi.org/10.1007/978-3-031-28238-6_42
Published: 17 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-28237-9
Online ISBN: 978-3-031-28238-6
eBook Packages: Computer Science, Computer Science (R0)
