Abstract
Diabetes is a chronic disease characterized by hyperglycemia, a persistently high level of blood sugar, which leads to complications such as blindness, cardiovascular disease, and amputation. The number of people with diabetes is expected to reach 642 million globally by 2040. Given this alarming figure, there is a strong need to diagnose diabetes early and predict its symptoms in order to save lives. One possible way to diagnose the disease is to leverage machine learning algorithms. Machine learning is rapidly being adopted across many healthcare domains. Given diabetes data, machine learning algorithms can find hidden patterns and predict whether a patient is diabetic or non-diabetic. This research provides a comparative analysis of the performance and effectiveness of selected machine learning algorithms in predicting diabetes in women. We developed a prediction framework and implemented ten different machine learning algorithms: Naive Bayes, BayesNet, Decision Tree, Random Forest, AdaBoost, Bagging, K-Nearest Neighbor, Support Vector Machine, Logistic Regression, and Multi-Layer Perceptron. Experimental results obtained on the Frankfurt hospital (Germany) dataset show that K-Nearest Neighbor, Random Forest, and Decision Tree outperformed the other algorithms on all metrics. We believe that our diabetes prediction framework will assist doctors in predicting diabetes mellitus with high accuracy.
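As a rough illustration of how classifiers can be compared "in terms of all metrics", the sketch below computes four common binary-classification metrics from predicted and true labels. The abstract does not name the metrics used, so accuracy, precision, recall, and F1 are an assumption here. The labels and the names `model_a` and `model_b` are made up for illustration and are not results from the paper.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification task."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["tn"], counts["fn"]


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 (assumed metric set)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = diabetic, 0 = non-diabetic (not data from the paper).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "model_a": [1, 0, 1, 0, 0, 0, 1, 0],  # hypothetical classifier output
    "model_b": [1, 1, 1, 1, 0, 1, 0, 0],  # hypothetical classifier output
}

# One row of metrics per model, so the models can be compared side by side.
results = {name: binary_metrics(y_true, y_pred)
           for name, y_pred in predictions.items()}

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

In a comparison like the one described above, each of the ten algorithms would contribute one such row of metrics, typically computed on held-out test data or averaged over cross-validation folds.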
Acknowledgement
This work was supported by the Roadway Transportation and Traffic Safety Research Center (RTTSRC) of the United Arab Emirates University (grant number 31R151).
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Malik, S., Harous, S., El-Sayed, H. (2021). Comparative Analysis of Machine Learning Algorithms for Early Prediction of Diabetes Mellitus in Women. In: Chikhi, S., Amine, A., Chaoui, A., Saidouni, D., Kholladi, M. (eds) Modelling and Implementation of Complex Systems. MISC 2020. Lecture Notes in Networks and Systems, vol 156. Springer, Cham. https://doi.org/10.1007/978-3-030-58861-8_7
DOI: https://doi.org/10.1007/978-3-030-58861-8_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58860-1
Online ISBN: 978-3-030-58861-8
eBook Packages: Engineering, Engineering (R0)