short-paper

SearchEHR: A Family History Search System for Clinical Decision Support

Authors:
Xiang Dai (CSIRO, Data61, Sydney, Australia),
Maciej Rybinski (CSIRO, Data61, Sydney, Australia),
Sarvnaz Karimi (CSIRO, Data61, Sydney, Australia)

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management, October 2021, Pages 4701–4705. https://doi.org/10.1145/3459637.3481986

Published: 30 October 2021

ABSTRACT

Finding patients with specific clinical conditions, such as a family history of diabetes, is an important task for clinical decision support. Clinical notes in Electronic Health Records (EHR), which document a patient's medical history and familial disease history, are valuable resources for patient cohort selection. However, such information is difficult to discover in clinical text, and full-text search techniques often fail due to the unique characteristics of clinical language. We describe a system, SearchEHR, that combines Natural Language Processing (NLP) and Information Retrieval (IR) techniques to make it easier to use clinical notes to find cohorts of patients, with a special focus on family disease history.
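To illustrate why plain full-text search can fall short for family history queries, here is a minimal Python sketch. It is not the SearchEHR pipeline: the toy notes, the relative-term list, and both function names are hypothetical, and the sentence-level rule stands in for the NLP extraction step that the abstract only names.

```python
import re

# Hypothetical toy notes, invented for illustration (not real EHR data).
NOTES = {
    "p1": "Patient has type 2 diabetes. No family history of cancer.",
    "p2": "Mother had diabetes. Patient denies chest pain.",
    "p3": "Father died of stroke. Patient is diabetic.",
}

# Assumed, deliberately small list of family-member mentions.
FAMILY_TERMS = r"\b(mother|father|sister|brother|aunt|uncle|grandmother|grandfather)\b"


def keyword_search(notes, term):
    """Naive full-text search: return patients whose note contains the term."""
    return sorted(pid for pid, text in notes.items() if term.lower() in text.lower())


def family_history_search(notes, term):
    """Return patients with a sentence mentioning both a relative and the term."""
    hits = []
    for pid, text in notes.items():
        # Split on whitespace that follows sentence-ending punctuation.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            has_relative = re.search(FAMILY_TERMS, sentence, re.I) is not None
            if has_relative and term.lower() in sentence.lower():
                hits.append(pid)
                break
    return sorted(hits)
```

On these toy notes, `keyword_search(NOTES, "diabetes")` returns both p1 (the patient's own condition) and p2 (the mother's), while `family_history_search(NOTES, "diabetes")` returns only p2. Neither function handles negation, abbreviations, or wording variants such as "diabetic", which are the kinds of clinical-language characteristics the abstract refers to.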

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Information extraction
  2. Information systems applications
    1. Decision support systems
      1. Expert systems

Published in
CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
ISBN: 9781450384469
DOI: 10.1145/3459637
General Chairs: Gianluca Demartini (The University of Queensland, Australia), Guido Zuccon (The University of Queensland, Australia)
Program Chairs: J. Shane Culpepper (RMIT University, Australia), Zi Huang (The University of Queensland, Australia), Hanghang Tong (University of Illinois at Urbana-Champaign, USA)
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
clinical notes
cohort selection
family history
information extraction
information retrieval
Acceptance Rates
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%