Abstract
Early breast cancer recurrence indicates a poor response to adjuvant therapy and threatens patients’ lives. Most existing prediction models for breast cancer recurrence are regression-based and difficult to interpret. We apply a decision tree algorithm to the clinical information of a cohort of non-metastatic invasive breast cancer patients to build a classifier that categorizes patients by whether they develop early recurrence and by similarities in their clinical and pathological diagnoses. The classifier predicts whether a patient develops early disease recurrence, with an estimated accuracy of about 70%. In an independent validation cohort of 65 patients, it predicts correctly for 55 patients. The classifier also groups patients by intrinsic properties of their disease and, for each subgroup, lists the disease characteristics in hierarchical order according to their relevance to early relapse. Overall, it identifies pathological nodal stage, percentage of intra-tumor stroma, and components of the TGFβ-Smad signaling pathway as highly relevant factors for early breast cancer recurrence. Because most of the disease characteristics used by this classifier are results of standardized tests routinely collected during breast cancer diagnosis, the classifier can readily be adopted in a range of research and clinical settings.
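As a rough illustration of the validation figure quoted above, the sketch below converts the 55 correct predictions out of 65 patients into a proportion. The Wilson score interval is an addition made here for context and is not reported in the abstract; the function name `wilson_interval` and the choice of z = 1.96 (roughly a 95% level) are assumptions for this example only.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Return the (low, high) Wilson score interval for a binomial proportion.

    Illustrative helper, not part of the original study.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Figures stated in the abstract: 55 of 65 validation patients predicted correctly.
correct, cohort = 55, 65
accuracy = correct / cohort
low, high = wilson_interval(correct, cohort)

print(f"validation accuracy: {accuracy:.3f}")
print(f"approx. 95% Wilson interval: {low:.3f} to {high:.3f}")
```

With only 65 patients the interval is fairly wide, which is one reason to read the validation proportion alongside the separately estimated accuracy of about 70% rather than in place of it.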
Acknowledgements
We would like to thank Drs. C. C. Engels, J. W. T. Dekker and E. M. de Kruijf for conducting immunohistochemistry staining, evaluating stroma percentage and recording original data; and Drs. A. Dibrov and Catalin Mihalcioiu for valuable discussions. J. Guo is supported by a Traineeship from the Breast Cancer Research Program of Congressionally Directed Medical Research Program (CDMRP). B. C. M. Fung is a Canada Research Chair in Data Mining for Cybersecurity. J.-J. Lebrun is a Sir William Dawson Research Chair of McGill University. This work was supported in part by grants from the Canadian Institutes for Health Research (CIHR) (fund codes 230670 and 233716 to J.-J. Lebrun), the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grants (fund code 356065-2013 to B. C. M. Fung), Canada Research Chairs Program (fund code 950-230623 to B. C. M. Fung), and Zayed University Research Incentive Fund and Research Cluster Award (fund codes R15048 and R16083 to F. Iqbal and B. C. M. Fung).
Author information
Contributions
J. Guo, B. C. M. Fung and J.-J. Lebrun designed the study and analyzed and interpreted the results. F. Iqbal participated in interpreting the results. P. J. K. Kuppen, R. A. E. M. Tollenaar and W. E. Mesker collected patient samples and designed the tumor tissue microarrays.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Guo, J., Fung, B.C.M., Iqbal, F. et al. Revealing determinant factors for early breast cancer recurrence by decision tree. Inf Syst Front 19, 1233–1241 (2017). https://doi.org/10.1007/s10796-017-9764-0
DOI: https://doi.org/10.1007/s10796-017-9764-0