Predicting chronic obstructive pulmonary disease (COPD) with optimized machine learning via leveraging comparative analysis of XGBoost and catboost

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Chronic obstructive pulmonary disease (COPD) affects the health of millions of people worldwide. This study presents a proof of concept showing that optimized machine learning (ML) models can predict COPD from the Exasen dataset. Because of their relevance to modeling irregular patterns, two stochastic processes are also described: the gamma process, a continuous-time model with gamma-distributed increments, and the compound Poisson process, which models random jumps occurring at Poisson events. The two algorithms used in this study are Extreme Gradient Boosting Classification (XGBC) and CatBoost Classification (CAT), both enhanced by the Artificial Rabbit Optimizer (ARO) for hyperparameter tuning. Performance in COPD prediction was measured by accuracy, precision, recall, and F1 score. Both optimized models performed well; in particular, the ARO-tuned XGBC model (XGAR) reached an accuracy above 0.910 in the training phase. Each model has its own characteristics. XGBC yielded slightly higher accuracy, but its computational cost can be substantial. CAT, by comparison, achieved results competitive with XGBC, possibly with faster training times. These results suggest that optimized XGBC and CAT are promising for COPD prediction on the Exasen dataset. Further studies are required to confirm them, especially regarding clinical applicability and generalizability across populations. The contribution of this study is the new application of ARO to hyperparameter tuning of COPD prediction models, which significantly improved the accuracy and overall performance of both XGBC and CAT on the Exasen dataset. By optimizing critical model parameters, ARO enhances predictive capability and has the potential to make ML more effective in medical diagnostics.
This work underlines the promise of ML models combined with advanced optimization techniques for improving COPD diagnosis, and thereby supporting disease management and personalized treatment.
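The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes a binary labelling (1 for COPD, 0 otherwise), which the preview does not confirm; the names `BinaryMetrics` and `binary_metrics` and the example labels are hypothetical and are not taken from the study or its data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> BinaryMetrics:
    """Compute accuracy, precision, recall and F1 for one positive class.

    Assumption: labels are integers and `positive` marks the COPD class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


# Hypothetical labels: 4 COPD cases and 6 non-COPD cases.
example_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
example_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
example_metrics = binary_metrics(example_true, example_pred)
```

In this hypothetical example there are 3 true positives, 1 false positive, 1 false negative and 5 true negatives, so accuracy is 0.8 and precision, recall and F1 are each 0.75. Reporting all four measures matters when classes are unbalanced, since accuracy alone can hide missed COPD cases.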

Data availability

The authors do not have permission to share data.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant number 81960804) and the Guangxi Natural Science Foundation project (Grant number 2023GXNSFAA026096).

Author information

Corresponding author

Correspondence to Ying Jiang.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Feng, Y., Jiang, F., Lu, D. et al. Predicting chronic obstructive pulmonary disease (COPD) with optimized machine learning via leveraging comparative analysis of XGBoost and catboost. J Ambient Intell Human Comput 16, 613–627 (2025). https://doi.org/10.1007/s12652-025-04967-3

  • DOI: https://doi.org/10.1007/s12652-025-04967-3
