Abstract
Breast cancer is one of the most frequent malignancies in women and accounts for a disproportionate number of new cancer cases and deaths worldwide. Clinicians have explored many options for predicting and diagnosing breast cancer at an early stage, because early identification improves prognosis and survival. Recently, machine learning approaches have been widely used in breast cancer pattern classification and predictive modelling to identify important attributes of the disease. To automatically predict whether breast cancer cells are malignant or benign, this research proposes an enhanced version of the XGBoost ensemble algorithm, called I-XGBoost. To improve identification accuracy, the proposed study considers three crucial phases: data pre-processing, feature extraction, and target role. Experiments are conducted on the standard Wisconsin Breast Cancer Diagnostic dataset. Furthermore, I-XGBoost is compared, in terms of precision, recall, F1-score and accuracy, with other classification techniques, including Support Vector Machine (SVM), Logistic Regression (LR), K-Nearest Neighbours (KNN), Naive Bayes (NB), Decision Tree (DT), Random Forest (RF), AdaBoost, and XGBoost. The results show that I-XGBoost achieves an accuracy of 98.24%, while the Logistic Regression classifier reaches 97%, so I-XGBoost improves on the state of the art by up to 1.24 percentage points.
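The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name binary_metrics, the label encoding (1 = malignant, 0 = benign) and the toy label lists are assumptions made up for illustration and are not taken from the Wisconsin dataset. The last line only reproduces the arithmetic behind the reported margin between 98.24% and 97%.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Return accuracy, precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted/actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels invented for illustration (assumed encoding: 1 = malignant, 0 = benign).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
scores = binary_metrics(y_true, y_pred)

# Margin between the two accuracies quoted in the abstract, in percentage points.
margin = round(98.24 - 97.00, 2)
```

On the toy labels there are three true positives, one false positive and one false negative, so precision, recall and F1 coincide; on real data they generally differ, which is why the abstract reports all of them alongside accuracy.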
Data availability
All data and supplementary material can be made available by the corresponding author upon reasonable request.
Code availability
The code used in this study can be made available by the corresponding author upon reasonable request.
Funding
The authors declare that this research has not been funded by any agency.
Author information
Contributions
Varshali Jaiswal: Methodology; Writing original draft.
Preetam Suman: Literature Review; Editing; Reviewing the Manuscript.
Dhananjay Bisen: Reviewing the Manuscript & final drafting.
Ethics declarations
The authors permit the publisher to publish this research article in this journal. The corresponding author conveys the consent of the individuals concerned to publish the data associated with this research article.
Ethical approval
All authors agree to the guidelines provided by the journal. This research has considered all ethical issues associated with the methodology employed.
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Jaiswal, V., Suman, P. & Bisen, D. An improved ensembling techniques for prediction of breast cancer tissues. Multimed Tools Appl 83, 31975–32000 (2024). https://doi.org/10.1007/s11042-023-16949-8
DOI: https://doi.org/10.1007/s11042-023-16949-8