
MedRetriever: Target-Driven Interpretable Health Risk Prediction via Retrieving Unstructured Medical Text

Authors:

Fenglong Ma

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management

Pages 2414 - 2423

https://doi.org/10.1145/3459637.3482273

Published: 30 October 2021


Abstract

The broad adoption of electronic health record (EHR) systems and advances in deep learning have motivated the development of health risk prediction models, which mainly depend on the expressiveness and temporal modeling capacity of deep neural networks (DNNs) to improve prediction performance. Some models further augment prediction with external knowledge; however, a great deal of EHR information is inevitably lost during the knowledge mapping. In addition, predictions made by existing models usually lack reliable interpretation, which undermines their reliability in guiding clinical decision-making. To address these challenges, we propose MedRetriever, an effective and flexible framework that leverages unstructured medical text collected from authoritative websites to augment health risk prediction and to provide understandable interpretation. MedRetriever also explicitly takes the target disease documents into consideration; these documents provide key guidance for the model to learn in a target-driven direction, i.e., from the target disease to the input EHR. Specifically, MedRetriever can flexibly choose its backbone from major predictive models to learn the EHR embedding for each visit. The EHR embedding and the features of the target disease documents are then aggregated into a query by self-attention to retrieve highly relevant text segments from the medical text pool, and the retrieved text is stored in the dynamically updated text memory. Finally, the comprehensive EHR embedding and the text memory are used for prediction and interpretation. We evaluate MedRetriever against nine state-of-the-art approaches on three real-world EHR datasets. It consistently achieves the best performance in AUC and recall, and outperforms the best baseline by at least 4.8% in recall on the three test datasets. Furthermore, we conduct case studies to show the easy-to-understand interpretation provided by MedRetriever.
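The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes fixed, hand-written embeddings, uses plain scaled dot-product attention without learned projections, and all names (`build_query`, `retrieve`, the toy vectors) are hypothetical. It only shows the general idea of aggregating an EHR embedding and target disease features into a query, scoring text segments against that query, and keeping an attention-weighted summary as a text memory.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]

def build_query(ehr_emb, target_emb):
    # Assumption: treat the two vectors as a two-token sequence, apply
    # scaled dot-product self-attention, then mean-pool into one query.
    tokens = [ehr_emb, target_emb]
    scale = math.sqrt(len(ehr_emb))
    attended = []
    for q in tokens:
        weights = softmax([dot(q, k) / scale for k in tokens])
        attended.append(weighted_sum(weights, tokens))
    return [sum(col) / len(attended) for col in zip(*attended)]

def retrieve(query, segment_embs, top_k=2):
    # Attention weights over text segments double as relevance scores.
    weights = softmax([dot(query, s) for s in segment_embs])
    # Hypothetical "text memory": attention-weighted sum of segment embeddings.
    memory = weighted_sum(weights, segment_embs)
    ranked = sorted(range(len(segment_embs)), key=lambda i: weights[i], reverse=True)
    return memory, weights, ranked[:top_k]

# Toy data (made up for illustration only).
ehr = [1.0, 0.0, 0.5]        # embedding of one visit
target = [0.8, 0.1, 0.4]     # features of target disease documents
segments = [
    [0.9, 0.0, 0.5],         # segment 0: closely related to the query
    [0.0, 1.0, 0.0],         # segment 1: unrelated
    [0.5, 0.2, 0.3],         # segment 2: partially related
]

query = build_query(ehr, target)
memory, weights, top = retrieve(query, segments, top_k=2)
print(top)
```

In this sketch the highest-weighted segments serve as the interpretation, since they are the text the query attended to most, while `memory` is the vector that would be combined with the EHR embedding for prediction.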

Supplementary Material

MP4 File (CIKM21-rgfp0552.mp4)

In this presentation, the authors cover the proposed MedRetriever structure for the health risk prediction task. First, they introduce the background of health risk prediction and the gap between existing methods and the expectation of reliable, understandable interpretability. They then describe unstructured medical text and how MedRetriever uses this type of data to improve health risk prediction performance. Finally, the authors discuss the experimental results and summarize their work.


