ABSTRACT
Cold-start diagnosis prediction is a challenging task for AI in healthcare, where often only a few visits per patient and a few observations per disease are available. Although meta-learning is widely adopted to address the data sparsity problem in general domains, directly applying it to healthcare data is less effective, since it is unclear how to capture both the temporal relations in clinical visits and the complicated relations among syndromic diseases for precise personalized diagnosis. To this end, we first propose a novel Meta-learning framework for cold-start diagnosis prediction in healthCare data (MetaCare). By explicitly encoding the effects of disease progression over time as a generalization prior, MetaCare dynamically predicts future diagnoses and their timestamps for infrequent patients. Then, to model complicated relations among rare diseases, we propose to utilize domain knowledge of hierarchical relations among diseases, and further perform diagnosis subtyping to mine the latent syndromic relations among diseases. Finally, to tailor the generic meta-learning framework with personalized parameters, we design a hierarchical patient subtyping mechanism that bridges the modeling of infrequent patients and rare diseases. We term the joint model MetaCare++. Extensive experiments on two real-world benchmark datasets show significant performance gains from MetaCare++, with average improvements of 7.71% for diagnosis prediction and 13.94% for diagnosis time prediction over the state-of-the-art baselines.
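For readers unfamiliar with the generic meta-learning setup that the abstract builds on, the following is a minimal sketch of a first-order, MAML-style training loop on a toy problem. It is not the MetaCare or MetaCare++ model: it has no temporal encoding, disease hierarchy, or subtyping. Each hypothetical "patient" is reduced to a scalar regression task with only a few observations, and every name here (`sample_task`, `adapt`, `meta_train`, the learning rates) is an assumption made for illustration.

```python
import random


def mse(w, xs, ys):
    """Mean squared error of the scalar model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_grad(w, xs, ys):
    """Gradient of the mean squared error with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def sample_task(rng, n_support=3, n_query=3):
    """Hypothetical 'patient': a task with its own slope and very few points."""
    slope = rng.gauss(2.0, 0.3)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_support + n_query)]
    ys = [slope * x for x in xs]
    support = (xs[:n_support], ys[:n_support])
    query = (xs[n_support:], ys[n_support:])
    return support, query


def adapt(w, support, inner_lr=0.1, steps=1):
    """Inner loop: personalize the shared weight on one task's support set."""
    xs, ys = support
    for _ in range(steps):
        w = w - inner_lr * mse_grad(w, xs, ys)
    return w


def meta_train(n_iters=2000, inner_lr=0.1, outer_lr=0.05, seed=0):
    """Outer loop: learn an initialization that adapts well from few samples."""
    rng = random.Random(seed)
    w = 0.0  # shared initialization, playing the role of the learned prior
    for _ in range(n_iters):
        support, query = sample_task(rng)
        w_task = adapt(w, support, inner_lr)
        # First-order approximation: apply the query-set gradient evaluated
        # at the adapted weight directly to the shared initialization.
        w -= outer_lr * mse_grad(w_task, *query)
    return w


if __name__ == "__main__":
    w_meta = meta_train()
    support, query = sample_task(random.Random(123))
    w_new = adapt(w_meta, support)
    print("meta-learned init:", w_meta)
    print("query loss before/after adaptation:",
          mse(w_meta, *query), mse(w_new, *query))
```

In this toy setting the shared initialization drifts toward the average task slope, and a single gradient step on three support points then personalizes it to a new task. The abstract's contribution lies in what replaces the scalar model and the uniform treatment of tasks, which this sketch deliberately leaves out.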
Index Terms
- MetaCare++: Meta-Learning with Hierarchical Subtyping for Cold-Start Diagnosis Prediction in Healthcare Data