To read this content please select one of the options below:

Early prediction of chronic disease using an efficient machine learning algorithm through adaptive probabilistic divergence based feature selection approach

Sandeepkumar Hegde (N.M.A.M. Institute of Technology, Nitte, India)
Monica R. Mundada (M.S. Ramaiah Institute of Technology, Bangalore, India)

International Journal of Pervasive Computing and Communications

ISSN: 1742-7371

Article publication date: 12 June 2020

Issue publication date: 19 February 2021

225

Abstract

Purpose

According to the World Health Organization, by 2025, the contribution of chronic disease is expected to rise by 73% compared to all deaths and it is considered as global burden of disease with a rate of 60%. These diseases persist for a longer duration of time, which are almost incurable and can only be controlled. Cardiovascular disease, chronic kidney disease (CKD) and diabetes mellitus are considered as three major chronic diseases that will increase the risk among the adults, as they get older. CKD is considered a major disease among all these chronic diseases, which will increase the risk among the adults as they get older. Overall 10% of the population of the world is affected by CKD and it is likely to double in the year 2030. The paper aims to propose novel feature selection approach in combination with the machine-learning algorithm which can early predict the chronic disease with utmost accuracy. Hence, a novel feature selection adaptive probabilistic divergence-based feature selection (APDFS) algorithm is proposed in combination with the hyper-parameterized logistic regression model (HLRM) for the early prediction of chronic disease.

Design/methodology/approach

A novel feature selection APDFS algorithm is proposed which explicitly handles the feature associated with the class label by relevance and redundancy analysis. The algorithm applies the statistical divergence-based information theory to identify the relationship between the distant features of the chronic disease data set. The data set required to experiment is obtained from several medical labs and hospitals in India. The HLRM is used as a machine-learning classifier. The predictive ability of the framework is compared with the various algorithm and also with the various chronic disease data set. The experimental result illustrates that the proposed framework is efficient and achieved competitive results compared to the existing work in most of the cases.

Findings

The performance of the proposed framework is validated by using the metric such as recall, precision, F1 measure and ROC. The predictive performance of the proposed framework is analyzed by passing the data set belongs to various chronic disease such as CKD, diabetes and heart disease. The diagnostic ability of the proposed approach is demonstrated by comparing its result with existing algorithms. The experimental figures illustrated that the proposed framework performed exceptionally well in prior prediction of CKD disease with an accuracy of 91.6.

Originality/value

The capability of the machine learning algorithms depends on feature selection (FS) algorithms in identifying the relevant traits from the data set, which impact the predictive result. It is considered as a process of choosing the relevant features from the data set by removing redundant and irrelevant features. Although there are many approaches that have been already proposed toward this objective, they are computationally complex because of the strategy of following a one-step scheme in selecting the features. In this paper, a novel feature selection APDFS algorithm is proposed which explicitly handles the feature associated with the class label by relevance and redundancy analysis. The proposed algorithm handles the process of feature selection in two separate indices. Hence, the computational complexity of the algorithm is reduced to O(nk+1). The algorithm applies the statistical divergence-based information theory to identify the relationship between the distant features of the chronic disease data set. The data set required to experiment is obtained from several medical labs and hospitals of karkala taluk ,India. The HLRM is used as a machine learning classifier. The predictive ability of the framework is compared with the various algorithm and also with the various chronic disease data set. The experimental result illustrates that the proposed framework is efficient and achieved competitive results are compared to the existing work in most of the cases.

Keywords

Citation

Hegde, S. and Mundada, M.R. (2021), "Early prediction of chronic disease using an efficient machine learning algorithm through adaptive probabilistic divergence based feature selection approach", International Journal of Pervasive Computing and Communications, Vol. 17 No. 1, pp. 20-36. https://doi.org/10.1108/IJPCC-04-2020-0018

Publisher

:

Emerald Publishing Limited

Copyright © 2020, Emerald Publishing Limited

Related articles