Diabetes Prediction by Machine Learning Algorithms and Risks Factors

  • Conference paper
Business Intelligence (CBI 2023)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 484)

Abstract

Diabetes is a chronic disease that can seriously affect health, but its risk can be reduced through early detection and care. This study compares the performance of six algorithms for predicting diabetes from common risk factors: artificial neural networks (ANNs), decision trees (DT), support vector machines (SVM), K-Nearest Neighbors (K-NN), Naive Bayes (NB) and Random Forests. The models are evaluated in terms of accuracy, sensitivity, specificity, precision and F-measure. The algorithms were tested in three settings: three factors (glucose, BMI and age), five factors (glucose, BMI, age, insulin and skin), and all factors. The variables with the greatest impact on diabetic patients are identified from association rules, which are extracted after the frequent variables have been mined with the FP-Growth algorithm. The results show that Random Forests performs best when all factors are used, whereas Naive Bayes outperforms Random Forests in the three-factor and five-factor settings.
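The five evaluation measures named in the abstract can all be derived from the confusion matrix of a binary classifier. The following is a minimal sketch under stated assumptions, not the authors' code: the function names `confusion_counts` and `evaluate`, the label encoding (1 = diabetic, 0 = non-diabetic) and the toy labels are hypothetical, and F-measure is assumed to mean the usual F1 score, the harmonic mean of precision and sensitivity.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, specificity, precision and F-measure."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = _ratio(tp, tp + fn)   # recall on the positive class
    specificity = _ratio(tn, tn + fp)   # recall on the negative class
    precision = _ratio(tp, tp + fp)
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    # Assumption: F-measure is taken to be F1.
    f_measure = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical labels for illustration only (1 = diabetic, 0 = non-diabetic).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
scores = evaluate(y_true, y_pred)
```

Applying the same `evaluate` call to each classifier and each factor subset would give one row per model and setting, which is the kind of comparison the abstract describes.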

Author information

Correspondence to Youssef Fakir.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Fakir, Y. (2023). Diabetes Prediction by Machine Learning Algorithms and Risks Factors. In: El Ayachi, R., Fakir, M., Baslam, M. (eds) Business Intelligence. CBI 2023. Lecture Notes in Business Information Processing, vol 484. Springer, Cham. https://doi.org/10.1007/978-3-031-37872-0_4

  • DOI: https://doi.org/10.1007/978-3-031-37872-0_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-37871-3

  • Online ISBN: 978-3-031-37872-0

  • eBook Packages: Computer Science, Computer Science (R0)