EarlyStage Diabetes Risk Detection Using Comparison of Xgboost, Lightgbm, and Catboost Algorithms

  • Conference paper
Advanced Information Networking and Applications (AINA 2024)

Abstract

Diabetes mellitus is a chronic metabolic disorder in which blood glucose levels are elevated because of insufficient insulin production or insulin resistance. The disease has a significant and growing global impact. Numerous studies have investigated the use of classification algorithms to detect and help prevent diabetes mellitus. However, existing research still faces challenges in achieving both optimal predictive performance and efficient detection time. This study compares three gradient boosting methods for classification: LightGBM, XGBoost, and CatBoost. The three algorithms are trained on the “early-stage diabetes risk prediction dataset”, and their results are compared to determine which is best suited to classifying early-stage diabetes. The test results indicate that the model trained with CatBoost performs best, with higher accuracy, precision, recall, F1-score, and ROC-AUC than the models trained with XGBoost and LightGBM. LightGBM, in turn, has the shortest computation time of the three.
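The five scores named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names `classification_metrics` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made here for illustration, and none of the numbers come from the study.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the paper): true labels and model scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

print(classification_metrics(y_true, y_pred))
print(roc_auc(y_true, y_score))
```

In a comparison like the one described, the same functions would be applied to the test-set predictions of each trained model, so that all models are scored on identical data.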

Author information

Corresponding author

Correspondence to Henny Febriana Harumy.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Harumy, H.F., Hardi, S.M., Al Banna, M.F. (2024). EarlyStage Diabetes Risk Detection Using Comparison of Xgboost, Lightgbm, and Catboost Algorithms. In: Barolli, L. (eds) Advanced Information Networking and Applications. AINA 2024. Lecture Notes on Data Engineering and Communications Technologies, vol 203. Springer, Cham. https://doi.org/10.1007/978-3-031-57931-8_2
