Abstract
Because sleep-disordered breathing is known to be associated with cardiovascular diseases, sleep breathing data can be used to predict them. Such data, however, almost inevitably contain missing values, which probably result from loss to follow-up, failure to attend medical appointments, lack of measurements, failure to send or retrieve questionnaires, and inaccurate data transfer. In this paper, we propose a denoising autoencoder-based imputation (DAEimp) algorithm to impute the missing values in the Sleep Heart Health Study (SHHS) dataset for the prediction of cardiovascular diseases. This algorithm consists of three major steps: (1) under the missing completely at random assumption, uniform random noise is added at the positions of missing values, converting missing data imputation into a denoising problem; (2) the noisy data and a missing-position indicator matrix are fed into an autoencoder model, and the reconstruction error, divided into the error at observed positions and the error at missing positions, is used for denoising; and (3) logistic regression is applied to the resulting complete dataset to identify cardiovascular diseases. Our results on the SHHS dataset indicate that the proposed DAEimp algorithm achieves state-of-the-art performance in missing data imputation and in the identification of cardiovascular diseases from sleep breathing data.
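The data preparation in steps (1) and (2) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the noise range `[0, 0.01)`, the 1 = observed / 0 = missing mask convention, and the use of mean squared error are all assumptions made for illustration. The abstract does not specify the network architecture or how the two error terms are combined, so the autoencoder itself and the logistic regression step are left out. The error split assumes a setting where missingness is simulated, so ground-truth values at missing positions are available.

```python
import numpy as np


def add_uniform_noise(x, mask, rng, low=0.0, high=0.01):
    """Replace missing entries (mask == 0) with uniform noise.

    Observed entries (mask == 1) are passed through unchanged.
    The noise range [low, high) is an illustrative assumption.
    """
    noise = rng.uniform(low, high, size=x.shape)
    # np.where selects elementwise, so NaNs at missing positions are discarded.
    return np.where(mask == 1, x, noise)


def build_autoencoder_input(x_noisy, mask):
    """Concatenate the noisy data and the indicator matrix feature-wise."""
    return np.concatenate([x_noisy, mask.astype(x_noisy.dtype)], axis=1)


def split_reconstruction_error(x_true, x_hat, mask):
    """Mean squared error, reported separately for observed and missing positions.

    x_true must be complete here (hypothetical simulated-missingness setting).
    """
    sq_err = (x_true - x_hat) ** 2
    n_obs = max(int(mask.sum()), 1)
    n_mis = max(int((1 - mask).sum()), 1)
    obs_err = float((mask * sq_err).sum() / n_obs)
    mis_err = float(((1 - mask) * sq_err).sum() / n_mis)
    return obs_err, mis_err


# Toy example: a 3 x 2 table with two entries marked as missing.
rng = np.random.default_rng(0)
x_complete = np.array([[0.2, 0.5], [0.7, 0.1], [0.4, 0.9]])
mask = np.array([[1, 0], [1, 1], [0, 1]])
x_observed = np.where(mask == 1, x_complete, np.nan)

x_noisy = add_uniform_noise(x_observed, mask, rng)
ae_input = build_autoencoder_input(x_noisy, mask)  # shape (3, 4)
```

In a full pipeline, a trained autoencoder would map `ae_input` to a reconstruction `x_hat`, and `split_reconstruction_error(x_complete, x_hat, mask)` would give the two error terms the abstract refers to.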
Acknowledgement
This work was supported in part by the Science and Technology Innovation Committee of Shenzhen Municipality, China, under Grant JCYJ20180306171334997, in part by the National Natural Science Foundation of China under Grant 61771397, in part by the Synergy Innovation Foundation of the University and Enterprise for Graduate Students in Northwestern Polytechnical University (NPU) under Grant XQ201911, in part by the Seed Foundation of Innovation and Creation for Graduate Students in NPU under Grant ZZ2019029, and in part by the Project for Graduate Innovation Team of NPU.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Dong, X., Zhang, J., Wang, G., Xia, Y. (2019). DAEimp: Denoising Autoencoder-Based Imputation of Sleep Heart Health Study for Identification of Cardiovascular Diseases. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2019. Lecture Notes in Computer Science, vol 11857. Springer, Cham. https://doi.org/10.1007/978-3-030-31654-9_44
DOI: https://doi.org/10.1007/978-3-030-31654-9_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31653-2
Online ISBN: 978-3-030-31654-9
eBook Packages: Computer Science, Computer Science (R0)