Optimized stacking ensemble models for the prediction of diabetic progression

Multimedia Tools and Applications

Abstract

The influence of applied machine learning on day-to-day life has grown significantly over the last few years. Using machine learning to predict various aspects of human life has helped industries discover knowledge, draw inferences, and ultimately improve business outcomes. In the healthcare industry, as machines that monitor various health parameters become increasingly connected, it is important to process the information they produce and draw inferences that help doctors prescribe medicines and advise on lifestyle changes. In this paper, the disease progression of Diabetes Mellitus in 442 patients is analyzed in terms of various health parameters together with six related blood serum measurements. An optimized stacking method is used to perform both regression and classification. In regression, a quantitative measure of disease progression is predicted, whereas in classification, disease progression is assigned to a high-progression or low-progression category. In both cases, a set of base models is chosen, and the accuracy scores of these base models are compared with the score of the optimized stacking-based ensemble model. Optimized stacking shows promising results in comparison with the individual methods. The method is also tested on standard datasets. The results are validated on a large dataset with 22 features and 70,692 records, which is used to predict the diabetic information of patients. The technique performed well on all the datasets. This method can be used as a data analysis backbone of healthcare-based IoT systems for predicting diabetic progression as well as for other related applications.
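The comparison of base models with a stacked ensemble for the regression task can be illustrated with a minimal sketch. It assumes scikit-learn and its bundled 442-patient diabetes dataset. The choice of base models, the RidgeCV meta-learner, and the 80/20 split are illustrative assumptions, not the authors' configuration, and the sketch shows plain stacking without the optimization step proposed in the paper.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression, RidgeCV
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# 442 patients, 10 features (including six blood serum measurements);
# the target is a quantitative measure of disease progression.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Assumed base models; the paper's actual choices are not stated in the abstract.
base_models = [
    ("linear", LinearRegression()),
    ("knn", KNeighborsRegressor(n_neighbors=10)),
    ("forest", RandomForestRegressor(n_estimators=100, random_state=0)),
]

# The meta-learner is trained on out-of-fold predictions of the base models.
stack = StackingRegressor(estimators=base_models, final_estimator=RidgeCV(), cv=5)

# Fit every base model and the stacked ensemble, then record R^2 on the test set.
scores = {}
for name, model in base_models + [("stack", stack)]:
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, r2 in scores.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

Whether the stacked model outperforms each base model depends on the split and on the chosen base models, so the printed scores should be read as a comparison on one split rather than as a reproduction of the reported results.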

Optimized stacking in classification and regression
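For the classification task, a comparable sketch can label each patient as high or low progression and stack several classifiers. The median threshold used to create the two classes is an assumption made here for illustration, since the abstract does not state the cut-off. The base classifiers and the logistic-regression meta-learner are likewise illustrative, and no optimization step is included.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_diabetes(return_X_y=True)

# Assumed labelling rule: 1 = high progression (above the median), 0 = low.
labels = (y > np.median(y)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=0, stratify=labels
)

# Assumed base classifiers; not taken from the paper.
base_models = [
    ("logreg", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier(n_neighbors=10)),
    ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
]

# The meta-learner combines out-of-fold class probabilities of the base models.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# Accuracy of each base model and of the stacked ensemble on the held-out split.
accuracy = {}
for name, model in base_models + [("stack", stack)]:
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)

for name, acc in accuracy.items():
    print(f"{name}: accuracy = {acc:.3f}")
```

A different threshold, for example a clinically motivated one, would change the class balance and therefore the accuracy figures.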

Data Availability

The authors hereby declare that the data used in this study are available in a public repository; the link is given in the manuscript.

Code Availability

The authors declare that, as the study is part of Ph.D. work, the custom code is not publicly available.

Notes

  1. https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database

  2. https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(diagnostic)

  3. The dataset is taken from https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html and is part of the scikit-learn dataset library for machine learning

  4. https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database

  5. https://www.kaggle.com/code/alexteboul/diabetes-health-indicators-dataset-notebook/output?select=diabetes_binary_5050split_health_indicators_BRFSS2015.csv

Author information

Contributions

We hereby declare that both authors contributed equally to the work carried out in this paper.

Corresponding author

Correspondence to Daliya V. K.

Ethics declarations

Conflict of Interests/Competing Interests

The authors hereby declare that there is no conflict of interest/competing interest with regard to the article submitted.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

V. K., D., Ramesh, T.K. Optimized stacking ensemble models for the prediction of diabetic progression. Multimed Tools Appl 82, 42901–42925 (2023). https://doi.org/10.1007/s11042-023-14858-4

  • DOI: https://doi.org/10.1007/s11042-023-14858-4
