Abstract
Blood sugar is the body's source of energy. Diabetes mellitus is a condition in which the body cannot regulate this energy supply: blood glucose, insulin, and the related functions of the glucose energy system become unregulated. This imbalance leads to serious conditions in the body, such as blood pressure disorders and diabetes. Many different sources of energy are available in nature. In this paper we describe the dataset and its patterns using box-and-whisker plots and histograms, select the nine best features with the Chi-Square test, and plot a correlation matrix as a heat map to show the correlation between features. We apply three rule-based classification algorithms, Decision Table, OneR, and JRip, to the prepared dataset, and then combine these three algorithms using the Bagging and Boosting ensemble methods. The models are evaluated by classification accuracy, precision, recall, and F1-score on the UCI diabetes dataset. We conclude that the Bagging ensemble method achieved the highest accuracy (98%), while the Decision Table, OneR, JRip, and Boosting algorithms scored lower on accuracy, precision, recall, and F1-score.
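As a rough illustration of the approach summarized in the abstract, the sketch below wraps a OneR-style rule learner in a bagging ensemble and computes accuracy, precision, recall, and F1-score. It is not the authors' implementation and does not reproduce their results. The toy records, the discretized feature values, and every function name are hypothetical, and scoring on the training rows is done only to keep the example short. Boosting, Decision Table, and JRip are not shown.

```python
import random
from collections import Counter, defaultdict


def train_one_r(rows, labels):
    """Fit a OneR rule: keep the single feature whose value -> majority-class
    mapping makes the fewest errors on the training rows."""
    best = None
    for f in range(len(rows[0])):
        counts = defaultdict(Counter)
        for row, y in zip(rows, labels):
            counts[row[f]][y] += 1
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[f]] != y for row, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, f, rule)
    default = Counter(labels).most_common(1)[0][0]  # fallback for unseen values
    return best[1], best[2], default


def predict_one_r(model, row):
    feature, rule, default = model
    return rule.get(row[feature], default)


def train_bagging(rows, labels, n_models=25, seed=0):
    """Train one OneR model per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_one_r([rows[i] for i in idx],
                                  [labels[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Combine the base models by majority vote."""
    votes = Counter(predict_one_r(m, row) for m in models)
    return votes.most_common(1)[0][0]


def evaluate(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, F1) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical toy records: (glucose level, BMI level, age group) -> diabetic?
rows = [
    ("high", "high", "old"), ("high", "low", "young"),
    ("high", "high", "young"), ("high", "low", "old"),
    ("low", "high", "old"), ("low", "low", "young"),
    ("low", "high", "young"), ("low", "low", "old"),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

models = train_bagging(rows, labels)
predictions = [predict_bagging(models, r) for r in rows]
# Scored on the training rows for brevity; a real study would hold out data.
print("accuracy, precision, recall, f1:", evaluate(labels, predictions))
```

Each base model sees a different bootstrap sample, so the individual rules can differ, and the majority vote smooths out those differences. That is the general motivation for bagging a simple rule learner.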
Acknowledgements
The author is grateful to Veer Bahadur Singh Purvanchal University, Jaunpur, Uttar Pradesh, for providing financial support through a Post Doctoral Research Fellowship.
Funding
This study was not funded.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article is part of the topical collection “Computational Statistics” guest edited by Anish Gupta, Mike Hinchey, Vincenzo Puri, Zeev Zalevsky and Wan Abdul Rahim.
Cite this article
Yadav, D.C., Pal, S. An Experimental Study of Diversity of Diabetes Disease Features by Bagging and Boosting Ensemble Method with Rule Based Machine Learning Classifier Algorithms. SN COMPUT. SCI. 2, 50 (2021). https://doi.org/10.1007/s42979-020-00446-y
DOI: https://doi.org/10.1007/s42979-020-00446-y