Abstract
Diabetes is a group of non-communicable diseases (NCDs) that cannot be cured with current medical technology and can lead to serious complications. Significantly reducing the severity of diabetes and its associated risk factors depends on accurate early prediction. Machine learning algorithms have been developed to assist in predicting diabetes, but their predictions are not always accurate and often lack interpretability, so further work is needed before they reach the level required for clinical application. The aim of this paper is to find a high-performance and interpretable diabetes prediction model. First, the dataset is preprocessed: missing values are imputed using K-nearest neighbors (KNN), and the classes are balanced using adaptive synthetic sampling (ADASYN). Then, using 10-fold cross-validation, the predictive performance of six machine learning algorithms is compared in terms of accuracy, precision, recall, and F1 score. Finally, the prediction results are explained globally and locally using SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME). The experimental results show that the eXtreme Gradient Boosting (XGBoost) algorithm provides the best predictive performance. The visualized eXplainable Artificial Intelligence (XAI) techniques offer valuable explanatory information, helping healthcare professionals and patients better understand diabetes risk and the prediction results.
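The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary labels coded 1 (diabetic) and 0 (non-diabetic). The function name `binary_metrics` and the example labels are hypothetical and are not taken from the paper, and returning 0.0 when a denominator is zero is one common convention rather than the authors' stated choice.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for labels coded 1/0."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    # Confusion-matrix counts, with class 1 treated as the positive class.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Assumed convention: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for a single validation fold (not data from the paper).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

for name, value in binary_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.3f}")
```

Under 10-fold cross-validation, a function like this would typically be applied to each held-out fold and the ten results averaged. Whether the paper averages per fold or pools predictions across folds is not stated in the abstract.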
Acknowledgement
The authors would like to thank the Universiti Kebangsaan Malaysia for supporting this work through Geran Galakan Penyelidik Muda (GGPM-2022-063) and research incentives TAP-K024478.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhao, Y., Chaw, J.K., Ang, M.C., Daud, M.M., Liu, L. (2024). A Diabetes Prediction Model with Visualized Explainable Artificial Intelligence (XAI) Technology. In: Badioze Zaman, H., et al. Advances in Visual Informatics. IVIC 2023. Lecture Notes in Computer Science, vol 14322. Springer, Singapore. https://doi.org/10.1007/978-981-99-7339-2_52
DOI: https://doi.org/10.1007/978-981-99-7339-2_52
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7338-5
Online ISBN: 978-981-99-7339-2
eBook Packages: Computer Science, Computer Science (R0)