Comparative Analysis of Machine Learning Algorithms for Early Prediction of Diabetes Mellitus in Women

  • Conference paper
Modelling and Implementation of Complex Systems (MISC 2020)

Abstract

Diabetes is a chronic disease characterized by hyperglycemia, a high level of blood sugar, which can lead to complications such as blindness, cardiovascular disease, and amputation. The number of diabetic patients is expected to reach 642 million globally by 2040. Given this alarming figure, there is a strong need to diagnose diabetes early and predict its symptoms in order to save lives. One possible way to diagnose the disease is to leverage machine learning algorithms, which are rapidly being adopted across many healthcare domains. Given diabetes data, machine learning algorithms can find hidden patterns to predict whether a patient is diabetic or non-diabetic. This research provides a comparative analysis of the performance and effectiveness of selected machine learning algorithms in predicting diabetes in women. We developed a prediction framework and implemented ten machine learning algorithms: Naive Bayes, BayesNet, Decision Tree, Random Forest, AdaBoost, Bagging, K-Nearest Neighbor, Support Vector Machine, Logistic Regression, and Multi-Layer Perceptron. Experimental results obtained on the Frankfurt hospital (Germany) dataset show that K-Nearest Neighbor, Random Forest, and Decision Tree outperformed the other algorithms on all metrics. We believe that our diabetes prediction framework will help doctors predict diabetes mellitus with high accuracy.
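As a rough illustration of what comparing classifiers on several metrics can look like, the following minimal sketch evaluates a small K-Nearest Neighbor classifier against a majority-class baseline. It is not the authors' framework: the abstract does not name the metrics or the features, so accuracy, precision, recall, and F1, the two features (glucose, BMI), the value k=3, and every data point below are assumptions made up for the example and are not taken from the Frankfurt hospital dataset.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def majority_predict(train_y):
    """Baseline: always predict the most frequent training label."""
    return Counter(train_y).most_common(1)[0][0]


def metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy data: (glucose, BMI) pairs; 1 = diabetic, 0 = non-diabetic.
X_train = [(85, 22.0), (90, 24.5), (100, 26.0), (95, 23.0),
           (150, 33.0), (165, 35.5), (140, 31.0)]
y_train = [0, 0, 0, 0, 1, 1, 1]
X_test = [(88, 23.5), (155, 34.0), (145, 30.0), (98, 25.0)]
y_test = [0, 1, 1, 0]

# Collect one prediction list per classifier, then score them all the same way.
predictions = {
    "knn": [knn_predict(X_train, y_train, x) for x in X_test],
    "majority": [majority_predict(y_train)] * len(X_test),
}
results = {name: metrics(y_test, pred) for name, pred in predictions.items()}

for name, scores in results.items():
    print(name, scores)
```

Scoring every classifier through the same metrics function keeps the comparison like for like. On an imbalanced dataset, the baseline shows why accuracy alone can mislead: it can score reasonably on accuracy while its recall for the diabetic class is zero.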

Acknowledgement

This work was supported by the Roadway Transportation and Traffic Safety Research Center (RTTSRC) of the United Arab Emirates University (grant number 31R151).

Author information

Corresponding author

Correspondence to Sumbal Malik.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Malik, S., Harous, S., El-Sayed, H. (2021). Comparative Analysis of Machine Learning Algorithms for Early Prediction of Diabetes Mellitus in Women. In: Chikhi, S., Amine, A., Chaoui, A., Saidouni, D., Kholladi, M. (eds) Modelling and Implementation of Complex Systems. MISC 2020. Lecture Notes in Networks and Systems, vol 156. Springer, Cham. https://doi.org/10.1007/978-3-030-58861-8_7

  • DOI: https://doi.org/10.1007/978-3-030-58861-8_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58860-1

  • Online ISBN: 978-3-030-58861-8

  • eBook Packages: Engineering, Engineering (R0)
