Abstract
With the introduction and increasing availability of Electronic Health Records (EHRs), disease prediction has recently gained considerable research attention and achieved impressive progress. Existing methods are based on RNN-like architectures, which treat every disease equally and learn representations from medical knowledge. However, these methods ignore the strong structural information among diseases. In this paper, we introduce a novel Path-based reasoning model for self-AttentIonal Disease prediction (ProAID), which uses medical paths extracted from patient EHRs and external medical knowledge bases to augment the latent interactions between diseases and to learn highly representative patient embeddings. By explicitly incorporating medical paths, ProAID generates embeddings that capture the hierarchical information of diseases and learns effective patient representations from the historical admission sequences in the patient's EHR, enabling accurate disease prediction for the next hospital admission. Extensive experiments on public medical datasets show that ProAID outperforms the compared methods, which indicates the effectiveness of the proposed model.
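As a rough illustration of the two ingredients named above, the following minimal sketch shows (1) reading a path for a disease from a hierarchy and (2) applying scaled dot-product self-attention to a sequence of admission vectors. It is not the ProAID implementation: the toy hierarchy `PARENT`, its codes, the helper names `path_to_root` and `self_attention`, and the use of identity projections (Q = K = V) are all assumptions made for illustration.

```python
import math

# Hypothetical toy hierarchy (child -> parent); codes are illustrative only.
PARENT = {
    "250.0": "250",
    "250": "endocrine",
    "401.9": "401",
    "401": "circulatory",
    "endocrine": "root",
    "circulatory": "root",
}


def path_to_root(code, parent=PARENT):
    """Return the chain of nodes from a disease code up to the hierarchy root."""
    path = [code]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(visits):
    """Scaled dot-product self-attention with identity projections.

    Assumption: queries, keys, and values are the visit vectors themselves;
    a trained model would use learned projection matrices instead.
    """
    d = len(visits[0])
    out = []
    for q in visits:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in visits]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, visits)) for i in range(d)])
    return out


if __name__ == "__main__":
    print(path_to_root("250.0"))
    print(self_attention([[1.0, 0.0], [0.0, 1.0]]))
```

Each output row of `self_attention` is a weighted average of all admission vectors, so every admission can draw on the whole history instead of only its recurrent predecessor.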
Cite this article
Lu, X., Cui, L., Sun, Z. et al. ProAID: path-based reasoning for self-attentional disease prediction. Knowl Inf Syst 63, 3087–3101 (2021). https://doi.org/10.1007/s10115-021-01617-w
DOI: https://doi.org/10.1007/s10115-021-01617-w