DOI: 10.1145/3459637.3481986
short-paper

SearchEHR: A Family History Search System for Clinical Decision Support

Published: 30 October 2021

ABSTRACT

Finding patients with specific clinical conditions, such as a family history of diabetes, is an important task for clinical decision support. Clinical notes in Electronic Health Records (EHR), which document a patient's medical history and familial disease history, are valuable resources for patient cohort selection. However, such information is difficult to discover in clinical text, and full-text search techniques often fail due to the unique characteristics of clinical language. We describe a system, SearchEHR, that combines Natural Language Processing (NLP) and Information Retrieval (IR) techniques to help users find patient cohorts from clinical notes, with a special focus on family disease history.

Supplemental Material

SEARCHEHR_2 (1).mp4 (mp4, 8.3 MB)

Published in

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
ISBN: 9781450384469
DOI: 10.1145/3459637

Copyright © 2021 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 1,861 of 8,427 submissions, 22%
