Abstract
The large volume of electronic health records (EHRs) accumulated in recent years provides a rich foundation for disease risk prediction. However, two challenges remain insufficiently addressed: the incompleteness of the raw data and the interpretability of the prediction model. In this study, we present SR-DF, a mimic learning approach for disease risk prediction on data with a large ratio of missing values, as one of the early attempts in this direction. Specifically, we adopt spectral regularization for learning from incomplete medical data, so that the missing values in the raw data can be estimated and imputed more accurately. Moreover, by utilizing deep forest, we obtain an effective, interpretable, and reliable model for disease risk prediction that requires far fewer parameters and is less sensitive to parameter settings. Experiments show that the proposed method outperforms the baselines and achieves relatively consistent and stable results.
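The imputation step can be illustrated with a minimal sketch in the spirit of the Soft-Impute algorithm of Mazumder et al., which fills missing entries by repeatedly soft-thresholding the singular values of the completed matrix. This is an illustration under stated assumptions, not the SR-DF implementation: the function name `soft_impute`, the shrinkage parameter `lam`, and the stopping rule are hypothetical choices, and missing values are assumed to be encoded as NaN.

```python
import numpy as np

def soft_impute(X, lam=1.0, max_iter=200, tol=1e-5):
    """Fill NaN entries of X by iterative soft-thresholded SVD.

    Hypothetical helper for illustration; lam, max_iter and tol are
    assumed defaults, not values taken from the paper.
    """
    observed = ~np.isnan(X)
    # Initial estimate: observed values, zeros in the missing positions.
    Z = np.where(observed, X, 0.0)
    for _ in range(max_iter):
        # Keep observed entries; use the current low-rank estimate elsewhere.
        filled = np.where(observed, X, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Soft-threshold the singular values (nuclear-norm shrinkage).
        s_shrunk = np.maximum(s - lam, 0.0)
        Z_new = (U * s_shrunk) @ Vt
        # Relative change between successive estimates.
        change = np.linalg.norm(Z_new - Z) / max(np.linalg.norm(Z), 1e-12)
        Z = Z_new
        if change < tol:
            break
    # Observed entries are returned unchanged; only the gaps are filled.
    return np.where(observed, X, Z)
```

Calling `soft_impute(X, lam=0.5)` on a patient-by-feature matrix with NaN gaps returns a matrix of the same shape with no missing values. A larger `lam` shrinks more singular values to zero and so yields a lower-rank estimate. The completed matrix could then be passed to any downstream classifier, such as a deep forest.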
This research is supported by the Fundamental Research Funds for the Central Universities (Grant No. 2412017QD028), the China Postdoctoral Science Foundation (Grant No. 2017M621192), and the Scientific and Technological Development Program of Jilin Province (Grant Nos. 20180520022JH and 20190302109GX).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Yue, L., Zhao, H., Yang, Y., Tian, D., Zhao, X., Yin, M. (2019). A Mimic Learning Method for Disease Risk Prediction with Incomplete Initial Data. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds) Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science(), vol 11448. Springer, Cham. https://doi.org/10.1007/978-3-030-18590-9_52
DOI: https://doi.org/10.1007/978-3-030-18590-9_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18589-3
Online ISBN: 978-3-030-18590-9
eBook Packages: Computer Science (R0)