MedPath: Augmenting Health Risk Prediction via Medical Knowledge Paths

Published: 03 June 2021

Abstract

The broad adoption of electronic health record (EHR) data and the availability of biomedical knowledge graphs (KGs) on the web have given clinicians and researchers unprecedented resources and opportunities for health risk prediction, with the goal of improving healthcare quality and medical resource allocation. Existing methods have focused on improving EHR feature representations using attention mechanisms, time-aware models, or external knowledge. However, they ignore the importance of using personalized information to make predictions. In addition, the reliability of their prediction interpretations needs to be improved, since their interpretable attention scores are not explicitly reasoned from disease progression paths. In this paper, we propose MedPath to address these challenges and augment existing risk prediction models with the ability to use personalized information and to provide reliable interpretations inferred from disease progression paths. First, MedPath extracts personalized knowledge graphs (PKGs) from a large-scale online medical knowledge graph; each PKG contains all possible disease progression paths from the observed symptoms to the target diseases. Next, to augment existing EHR encoders and achieve better predictions, MedPath learns a PKG embedding by conducting multi-hop message passing from symptom nodes to target disease nodes through a graph neural network encoder. Because MedPath reasons about disease progression along the paths that exist in PKGs, it can provide explicit explanations for a prediction by pointing out how the observed symptoms can eventually lead to the target diseases. Experimental results on three real-world medical datasets show that MedPath improves the performance of eight state-of-the-art methods, yielding higher F1 scores and AUCs. Our case study also demonstrates that MedPath can greatly improve the explicitness of the risk prediction interpretation.
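
The PKG extraction step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy knowledge graph, the relation names, the function name `extract_pkg`, and the `max_hops` bound are all assumptions made for illustration. The sketch enumerates simple paths from symptom nodes to target disease nodes and keeps every edge that lies on at least one such path.

```python
from collections import defaultdict


def extract_pkg(edges, symptoms, targets, max_hops=3):
    """Collect edges on simple paths (at most max_hops edges) from any
    symptom node to any target node. Hypothetical helper, not from the paper."""
    adj = defaultdict(list)
    for head, rel, tail in edges:
        adj[head].append((rel, tail))

    targets = set(targets)
    pkg_edges = set()   # union of all edges that appear on a found path
    paths = []          # each path is a list of (head, relation, tail) triples

    def dfs(node, path_edges, visited):
        # Record the current path whenever it ends at a target disease.
        if node in targets and path_edges:
            paths.append(list(path_edges))
            pkg_edges.update(path_edges)
        # Stop expanding once the hop budget is used up.
        if len(path_edges) == max_hops:
            return
        for rel, nxt in adj[node]:
            if nxt in visited:  # keep paths simple (no repeated nodes)
                continue
            visited.add(nxt)
            path_edges.append((node, rel, nxt))
            dfs(nxt, path_edges, visited)
            path_edges.pop()
            visited.remove(nxt)

    for symptom in symptoms:
        dfs(symptom, [], {symptom})
    return pkg_edges, paths


# Toy knowledge graph; node and relation names are invented for illustration.
toy_kg = [
    ("cough", "symptom_of", "bronchitis"),
    ("bronchitis", "progresses_to", "copd"),
    ("cough", "symptom_of", "asthma"),
    ("fatigue", "symptom_of", "anemia"),
    ("anemia", "associated_with", "heart_failure"),
    ("bronchitis", "associated_with", "heart_failure"),
]

pkg, found_paths = extract_pkg(toy_kg, symptoms=["cough"], targets=["copd"])
for path in found_paths:
    print(" -> ".join([path[0][0]] + [tail for _, _, tail in path]))
```

In this sketch, edges that do not lie on any symptom-to-target path (for example, the `asthma` and `anemia` edges) are left out of the personalized graph, which is what lets each retained path be read as an explanation of how an observed symptom could lead to the target disease.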

Published In

WWW '21: Proceedings of the Web Conference 2021
April 2021
4054 pages
ISBN:9781450383127
DOI:10.1145/3442381

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph neural network
  2. healthcare informatics
  3. model interpretability
  4. risk prediction

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '21
WWW '21: The Web Conference 2021
April 19 - 23, 2021
Ljubljana, Slovenia

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
