An Effective Feature Selection for Diabetes Prediction

  • Conference paper
Database and Expert Systems Applications - DEXA 2023 Workshops (DEXA 2023)

Abstract

With the rapid advancement of technology and the ever-increasing amount of data in the healthcare domain, big data analytics has become a significant area of study. Analyzing patterns in patient treatment for the early detection and diagnosis of diseases can improve overall healthcare quality. Machine learning (ML) has emerged as a promising technology for helping clinicians make accurate diagnostic decisions. In this paper, we propose an approach that combines a feature selection technique with several ML algorithms (GBDT, NB, K-NN, SVM, LR, RF, and DT) to identify the subset of features relevant to the prediction of diabetes. The performance of each algorithm is evaluated on the Pima Indians Diabetes Dataset and the Korean National Health and Nutrition Dataset. Experimental results show that the GBDT algorithm predicts the disease with the highest accuracy.
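To make the idea of selecting a relevant feature subset concrete, the following is a minimal sketch of a filter-style selection step. The abstract does not say which feature selection technique the authors use, so information gain with a median split is swapped in here purely for illustration. The feature names, the toy records, and the helper functions (`entropy`, `information_gain`, `select_top_k`) are all assumptions and are not taken from the paper or from either dataset.

```python
import math
from statistics import median


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting one numeric feature at its median."""
    cut = median(values)
    low = [y for v, y in zip(values, labels) if v <= cut]
    high = [y for v, y in zip(values, labels) if v > cut]
    n = len(labels)
    remainder = (len(low) / n) * entropy(low) + (len(high) / n) * entropy(high)
    return entropy(labels) - remainder


def select_top_k(columns, labels, k):
    """Rank features by information gain and return the top k names plus all scores."""
    scores = {name: information_gain(vals, labels) for name, vals in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Toy, made-up records for illustration only (hypothetical feature names and values).
columns = {
    "glucose": [85, 90, 99, 105, 140, 150, 165, 180],
    "bmi": [22.0, 24.1, 26.0, 31.5, 23.5, 33.0, 35.2, 36.4],
    "skin_thickness": [20, 35, 22, 34, 21, 36, 23, 33],
}
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = diabetic, 0 = non-diabetic (assumed encoding)

selected, scores = select_top_k(columns, labels, k=2)
print(selected)
```

In a pipeline like the one the abstract describes, the selected columns would then be passed to each classifier and the resulting accuracies compared; that step is omitted here because the abstract gives no details about model settings.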



Author information

Corresponding author

Correspondence to Jeong-Dong Kim.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kang, Ia., Ngnamsie Njimbouom, S., Kim, JD. (2023). An Effective Feature Selection for Diabetes Prediction. In: Kotsis, G., et al. Database and Expert Systems Applications - DEXA 2023 Workshops. DEXA 2023. Communications in Computer and Information Science, vol 1872. Springer, Cham. https://doi.org/10.1007/978-3-031-39689-2_10

  • DOI: https://doi.org/10.1007/978-3-031-39689-2_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-39688-5

  • Online ISBN: 978-3-031-39689-2

  • eBook Packages: Computer Science, Computer Science (R0)
