Comparative analysis of classification algorithms on the breast cancer recurrence using machine learning

  • Original Article
  • Medical & Biological Engineering & Computing

Abstract

This paper presents a comparative evaluation of classification algorithms using the Waikato Environment for Knowledge Analysis (WEKA) software. The main goal is to conduct a comprehensive comparison and determine which predictive modelling technique is best suited to classifying breast cancer recurrence. The dataset consists of 286 instances (201 in the non-recurrence class and 85 in the recurrence class) and 10 attributes. The comparison covers Naïve Bayes, J48, K*, Random Forest, Multilayer Perceptron (MLP) and Support Vector Machine (SVM) models, each evaluated under different parameter settings. Model performance is assessed using the following evaluation metrics: accuracy, precision, sensitivity, specificity, mean absolute error, ROC curves and AUC values. The contribution of each attribute to the classification models is assessed by measuring information gain. Results show that the J48 and SVM models give the highest accuracies, 75.5% and 79.6%, respectively. The SVM model also shows the highest sensitivity, 99%, while the MLP model attains the highest precision, 79%. In addition, the SVM model has the lowest mean absolute error. Furthermore, the information gain analysis reveals that the degree of tumour malignancy contributes more than any other attribute to the classification of breast cancer recurrence.
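The evaluation metrics and the information gain measure named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' WEKA workflow: the function names (`binary_metrics`, `entropy`, `information_gain`) and the toy counts and labels are hypothetical, chosen only to show the standard definitions, and the sketch assumes a binary class, nominal attributes and non-zero denominators.

```python
import math
from collections import Counter


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity and specificity from confusion-matrix counts.

    Assumes every denominator is non-zero (no guard for empty classes).
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # true-positive rate (recall)
        "specificity": tn / (tn + fp),  # true-negative rate
    }


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Class entropy minus the expected class entropy after splitting on a nominal attribute."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy, made-up inputs for illustration only (NOT results or data from the paper).
toy_metrics = binary_metrics(tp=8, fp=2, tn=6, fn=4)
toy_grade = ["1", "1", "2", "2", "3", "3", "3", "3"]          # hypothetical malignancy grades
toy_class = ["no", "no", "no", "yes", "yes", "yes", "yes", "no"]  # hypothetical recurrence labels
toy_gain = information_gain(toy_grade, toy_class)

print(toy_metrics)
print(round(toy_gain, 3))
```

Ranking attributes by `information_gain` and picking the largest value mirrors, in spirit, how an attribute such as the degree of malignancy could be identified as the most informative. WEKA's own attribute evaluators may differ in details such as the handling of missing values.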



Acknowledgements

This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data. Please include this citation if you plan to use this database.

Author information

Corresponding author

Correspondence to Gholamreza Anbarjafari.

About this article

Cite this article

Mikhailova, V., Anbarjafari, G. Comparative analysis of classification algorithms on the breast cancer recurrence using machine learning. Med Biol Eng Comput 60, 2589–2600 (2022). https://doi.org/10.1007/s11517-022-02623-y

  • DOI: https://doi.org/10.1007/s11517-022-02623-y
