Abstract
The high prevalence of chronic diseases, together with poor health conditions and rising diagnosis and treatment costs, necessitates a focus on prevention, early detection and disease management. In this paper, the correlation among chronic diseases is examined through their diagnostic codes using unsupervised Machine Learning (ML) approaches. Healthcare data is categorized into clinical, Medi-claim, drug and emergency information. In this work, Medi-claim data is used to explore five types of chronic disorders: diabetes, heart, kidney and liver diseases, and cancer. Medi-claim data is well suited to this task because of its legitimacy, volume and demographic coverage. Hierarchical Condition Category (HCC) and International Classification of Diseases (ICD) coding of Medi-claim data is consistent and yields a reliable data set, and this property of ICD and HCC codes motivated us to work with Medi-claim records. Chronic and non-chronic diseases are categorized through HCC codes using different clustering techniques, namely partitional, hierarchical and fuzzy K-means clustering. The model is evaluated using several metrics: homogeneity, completeness, V-measure, Adjusted Rand Index and Adjusted Mutual Information. Among all the clustering techniques used, K-means and K-means random have shown promising results. A conclusion on the clustering of chronic diseases is drawn with its clinical significance in mind.
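The evaluation metrics named above compare a clustering against known class labels. As an illustration only, the following minimal sketch computes homogeneity, completeness and V-measure from their standard entropy-based definitions in plain Python. The function names and the toy labels are hypothetical and are not taken from the paper, whose actual implementation and data are not shown here.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(targets, given):
    """H(targets | given), computed from the joint counts of both labelings."""
    n = len(targets)
    joint = Counter(zip(targets, given))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (_, g), c in joint.items())


def homogeneity_completeness_v(classes, clusters):
    """Return (homogeneity, completeness, V-measure) for two labelings.

    Homogeneity is 1 when every cluster holds members of a single class;
    completeness is 1 when every class falls inside a single cluster;
    V-measure is the harmonic mean of the two.
    """
    h_classes = entropy(classes)
    h_clusters = entropy(clusters)
    homogeneity = (
        1.0 if h_classes == 0
        else 1.0 - conditional_entropy(classes, clusters) / h_classes
    )
    completeness = (
        1.0 if h_clusters == 0
        else 1.0 - conditional_entropy(clusters, classes) / h_clusters
    )
    if homogeneity + completeness == 0:
        return homogeneity, completeness, 0.0
    v_measure = 2 * homogeneity * completeness / (homogeneity + completeness)
    return homogeneity, completeness, v_measure


if __name__ == "__main__":
    # Hypothetical toy labels: true disease category vs. assigned cluster id.
    true_classes = ["chronic", "chronic", "chronic",
                    "non-chronic", "non-chronic", "non-chronic"]
    cluster_ids = [0, 0, 1, 1, 2, 2]
    h, c, v = homogeneity_completeness_v(true_classes, cluster_ids)
    print(f"homogeneity={h:.3f} completeness={c:.3f} v_measure={v:.3f}")
```

A clustering that splits one class across several clusters can stay fairly homogeneous while losing completeness, which is why the two scores are reported alongside their harmonic mean.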
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Mohan Kumar, K.N., Sampath, S., Imran, M., Pradeep, N. (2021). Clustering Diagnostic Codes: Exploratory Machine Learning Approach for Preventive Care of Chronic Diseases. In: Satapathy, S., Zhang, YD., Bhateja, V., Majhi, R. (eds) Intelligent Data Engineering and Analytics. Advances in Intelligent Systems and Computing, vol 1177. Springer, Singapore. https://doi.org/10.1007/978-981-15-5679-1_53
DOI: https://doi.org/10.1007/978-981-15-5679-1_53
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-5678-4
Online ISBN: 978-981-15-5679-1
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)