Predicting pathological response to neoadjuvant chemotherapy in breast cancer patients based on imbalanced clinical data

  • Original Article
Personal and Ubiquitous Computing

Abstract

Neoadjuvant chemotherapy (NAC) may help some breast cancer patients with subsequent surgery or radiotherapy. However, NAC carries certain risks. To lower these risks, machine-learning methods can be used to assist the diagnosis of breast tumors from clinical data. This study investigated the use of ensemble machine-learning models to predict the pathological response to NAC in breast cancer patients, using actual clinical data. An ensemble k-nearest neighbor (EKNN) model was chosen to predict pathological response. The imbalanced clinical data of patients who received NAC were reviewed retrospectively, and 11 clinicopathological variables were selected from all features to establish a succinct EKNN model. Clinical data from a total of 259 patients were included in the model: 36 cases of pathological complete response, 157 cases of partial response, and 66 cases of stable disease. The training and testing sets for each single k-nearest neighbor (KNN) model contained 27 and 9 patients, respectively. To address the class imbalance, an ensemble-learning EKNN was designed in which the number of samples for each class in a base learner is set equal to the size of the smallest class, 36. The EKNN model classified the pathological response of breast cancer patients after NAC with an accuracy of 81.48% and a Kappa coefficient of 0.72, indicating better robustness and generalization than the average single KNN model (average accuracy 62.22%, Kappa coefficient 0.43). Based on actual clinical data, important clinicopathological variables were selected, and the imbalance problem was handled well by the EKNN model. The model improved the robustness and generalization of pathological-response prediction with imbalanced clinical data. These results suggest that ensemble machine learning has possible practical applications in assisting cancer stage diagnosis and precision medicine.
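The balancing idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`knn_predict`, `balanced_subsets`, `eknn_predict`, `cohen_kappa`), the random undersampling, the majority vote, the value of `k`, and the synthetic two-feature data are all assumptions made for illustration. Only the class counts (36, 157, 66) and the rule "every class contributes as many samples as the smallest class" come from the abstract.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Plain k-nearest-neighbor vote using Euclidean distance."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def balanced_subsets(X, y, n_learners, rng):
    """For each base learner, draw the same number of samples from every class
    (the size of the smallest class), without replacement within a subset."""
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    n_min = min(len(rows) for rows in by_class.values())
    subsets = []
    for _ in range(n_learners):
        sub_X, sub_y = [], []
        for label, rows in by_class.items():
            for row in rng.sample(rows, n_min):
                sub_X.append(row)
                sub_y.append(label)
        subsets.append((sub_X, sub_y))
    return subsets


def eknn_predict(subsets, x, k=3):
    """Majority vote over the base KNN learners (an assumed combination rule)."""
    votes = [knn_predict(sx, sy, x, k) for sx, sy in subsets]
    return Counter(votes).most_common(1)[0][0]


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:  # degenerate case: a single class on both sides
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Synthetic stand-in data; only the class counts come from the abstract.
rng = random.Random(0)
counts = {"pCR": 36, "PR": 157, "SD": 66}
centers = {"pCR": (0.0, 0.0), "PR": (2.0, 2.0), "SD": (4.0, 0.0)}  # hypothetical
X, y = [], []
for label, n in counts.items():
    cx, cy = centers[label]
    for _ in range(n):
        X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
        y.append(label)

subsets = balanced_subsets(X, y, n_learners=15, rng=rng)
print(Counter(subsets[0][1]))             # every class appears 36 times
print(eknn_predict(subsets, (0.1, -0.2)))  # ensemble vote for one new point
```

If the test set is balanced over the three classes (an assumption; the abstract does not state this explicitly), the chance agreement is 1/3 whatever the predictions are, so kappa reduces to (accuracy - 1/3) / (2/3). For the reported accuracies this gives (0.8148 - 0.3333) / 0.6667 ≈ 0.72 and (0.6222 - 0.3333) / 0.6667 ≈ 0.43, which agrees with the Kappa coefficients quoted in the abstract.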



Funding

This study received financial support from NSFC (21473025 and 81773171), the Science and Technology Development Planning of Jilin Province (20150204041GX and 20160204044GX), and the Education Department of Jilin Province (2015552, 2014B045, 2015553 and 2015556).

Author information


Correspondence to LiHong Hu or Bing Han.


Cite this article

Gao, T., Hao, Y., Zhang, H. et al. Predicting pathological response to neoadjuvant chemotherapy in breast cancer patients based on imbalanced clinical data. Pers Ubiquit Comput 22, 1039–1047 (2018). https://doi.org/10.1007/s00779-018-1144-3


  • DOI: https://doi.org/10.1007/s00779-018-1144-3
