Abstract
Advances in Machine Learning (ML), from pattern recognition to computational statistical learning, have increased its utility for breast cancer: ML supports screening strategies that account for diverse risk factors with complex relationships, and it enables personalized early prediction. In this work, we focused on ensemble ML models, trained after applying the synthetic minority oversampling technique (SMOTE) and evaluated with 10-fold cross-validation. The models were compared in terms of accuracy, precision, recall and area under the curve (AUC). In the experimental evaluation, Rotation Forest outperformed the other models, achieving accuracy, precision and recall of 82% each and an AUC of 87.4%.
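The oversampling step named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it is a simplified SMOTE variant in plain Python that picks base samples at random instead of iterating over every minority sample, and the function name `smote`, its parameters and the toy data are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples (simplified SMOTE sketch).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy, made-up data (hypothetical): 3 minority samples against 6 majority samples.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2)]
majority_count = 6
new_points = smote(minority, n_new=majority_count - len(minority), k=2)
balanced_minority = minority + new_points  # now as many as the majority class
```

Because every synthetic point is an interpolation between two real minority samples, it stays inside the region those samples already span. When oversampling is combined with cross-validation, a common precaution is to oversample only the training folds so that synthetic points never leak into the held-out fold; whether the paper does this is not stated in the abstract.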
Acknowledgements
This research was funded by the European Union and Greece (Partnership Agreement for the Development Framework 2014–2020) under the Regional Operational Programme Ionian Islands 2014–2020, project title: “Indirect costs for project ‘Smart digital applications and tools for the effective promotion and enhancement of the Ionian Islands bio-diversity’”, project number: 5034557.
Copyright information
© 2023 IFIP International Federation for Information Processing
About this paper
Cite this paper
Dritsas, E., Trigka, M., Mylonas, P. (2023). Ensemble Machine Learning Models for Breast Cancer Identification. In: Maglogiannis, I., Iliadis, L., Papaleonidas, A., Chochliouros, I. (eds) Artificial Intelligence Applications and Innovations. AIAI 2023 IFIP WG 12.5 International Workshops. AIAI 2023. IFIP Advances in Information and Communication Technology, vol 677. Springer, Cham. https://doi.org/10.1007/978-3-031-34171-7_24
DOI: https://doi.org/10.1007/978-3-031-34171-7_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-34170-0
Online ISBN: 978-3-031-34171-7
eBook Packages: Computer Science, Computer Science (R0)