Abstract
Diabetes is a chronic disease that can seriously affect a person’s health, but the risk of developing it can be reduced through early detection and care. This study compares the performance of six algorithms for predicting diabetes from common risk factors: artificial neural networks (ANNs), decision trees (DT), support vector machines (SVM), K-Nearest Neighbors (K-NN), Naive Bayes (NB) and Random Forests. The models are evaluated in terms of accuracy, sensitivity, specificity, precision and F-measure. The algorithms were tested in three settings: three factors (glucose, BMI and age), five factors (glucose, BMI, age, insulin and skin thickness), and all available factors. The variables with the greatest impact on diabetic patients are identified from association rules, which are generated after the frequent variables have been extracted with the FP-Growth algorithm. The results show that Random Forests is the best-performing algorithm when all factors are used, whereas Naive Bayes outperforms Random Forests in the three-factor and five-factor cases.
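The five evaluation measures named in the abstract can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch in Python, not the authors' implementation: the function names (`confusion_counts`, `evaluate`), the label encoding (1 = diabetic, 0 = non-diabetic) and the toy labels are assumptions made for illustration only, and the F-measure is taken here to be the usual harmonic mean of precision and sensitivity.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # diabetic cases predicted as diabetic
    fp: int  # non-diabetic cases predicted as diabetic
    tn: int  # non-diabetic cases predicted as non-diabetic
    fn: int  # diabetic cases predicted as non-diabetic


def confusion_counts(y_true, y_pred, positive=1):
    """Count the four outcomes of a binary classifier (hypothetical helper)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent from the data.
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, specificity, precision and F-measure."""
    c = confusion_counts(y_true, y_pred, positive)
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # true positive rate (recall)
    specificity = _ratio(c.tn, c.tn + c.fp)  # true negative rate
    precision = _ratio(c.tp, c.tp + c.fp)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.tn + c.fp + c.fn)
    f_measure = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


# Toy labels for illustration only; these are not results from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
for name, value in evaluate(y_true, y_pred).items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, the same `evaluate` call could be repeated for each classifier and for each of the three factor settings, which makes the resulting figures directly comparable.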
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Fakir, Y. (2023). Diabetes Prediction by Machine Learning Algorithms and Risks Factors. In: El Ayachi, R., Fakir, M., Baslam, M. (eds) Business Intelligence. CBI 2023. Lecture Notes in Business Information Processing, vol 484. Springer, Cham. https://doi.org/10.1007/978-3-031-37872-0_4
DOI: https://doi.org/10.1007/978-3-031-37872-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37871-3
Online ISBN: 978-3-031-37872-0
eBook Packages: Computer Science, Computer Science (R0)