DOI: 10.1145/3647444.3652479
research-article

Machine Learning approach for Diabetes Prediction using Pima Dataset

Published: 13 May 2024

Abstract

The global rise in diabetes cases calls for effective prediction and early-detection techniques. This study applies machine learning methods to identify diabetes in the PIMA diabetes dataset, which consists of diagnostic and clinical measurements and is the study's primary data source. Five algorithms are compared: K-Nearest Neighbors, Logistic Regression, Decision Trees, Random Forest, and XGBoost. Each algorithm is trained and tested on this dataset and evaluated on precision, recall, F1-score, and accuracy, with cross-validation and hyperparameter tuning used to improve performance. The algorithms exhibited different levels of performance, which allowed a comprehensive analysis of their strengths and weaknesses in predicting diabetes. Among them, XGBoost consistently achieved high precision, recall, F1-score, and accuracy, and the study concludes that it is the most suitable of the evaluated algorithms for identifying diabetes in the PIMA dataset. These findings highlight the value of machine learning methods for diabetes prediction and can help in developing effective tools for early detection, ultimately leading to better healthcare outcomes for those at risk.
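The evaluation loop the abstract describes (train, test, score on accuracy, precision, recall, and F1, with cross-validation and hyperparameter tuning) can be illustrated with a minimal sketch. Everything in it is an assumption for illustration: it uses only the Python standard library, a hand-written K-Nearest Neighbors classifier as a stand-in for the five algorithms, and synthetic two-feature data instead of the PIMA dataset. The function names are hypothetical and this is not the authors' pipeline.

```python
import random
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row, x))
    nearest = sorted(zip(train_X, train_y), key=lambda pair: sq_dist(pair[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def k_fold_indices(n, n_folds, seed=0):
    """Shuffle the indices 0..n-1 and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cross_validate_knn(X, y, k, n_folds=5, seed=0):
    """Predict every sample while its fold is held out, then pool the metrics."""
    y_true, y_pred = [], []
    for held_out in k_fold_indices(len(X), n_folds, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        for i in held_out:
            y_true.append(y[i])
            y_pred.append(knn_predict(train_X, train_y, X[i], k))
    return classification_metrics(y_true, y_pred)


def tune_k(X, y, candidates=(1, 3, 5, 7)):
    """Hyperparameter tuning: pick the odd k with the best cross-validated F1."""
    scores = {k: cross_validate_knn(X, y, k)["f1"] for k in candidates}
    return max(scores, key=scores.get), scores


def make_synthetic_data(n=200, seed=42):
    """Synthetic stand-in data (NOT the PIMA dataset): two noisy clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.0 if label else -1.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_data()
    best_k, scores = tune_k(X, y)
    print("cross-validated F1 per k:", scores)
    print("selected k:", best_k)
    print("metrics at selected k:", cross_validate_knn(X, y, best_k))
```

Pooling the held-out predictions across folds before scoring is one common choice; averaging per-fold scores is another. Comparing several algorithms as the study does would repeat the same cross-validated scoring for each model and its own hyperparameter grid.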


Published In

ICIMMI '23: Proceedings of the 5th International Conference on Information Management & Machine Intelligence
November 2023
1215 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICIMMI 2023
