Abstract
This paper investigates the integration of clinico-pathological and microRNA data for breast cancer relapse prediction. Clinical and pathological data have proved relevant for predicting cancer outcome, and the most accurate predictive models can be obtained by combining clinico-pathological information with genomic information. We analyzed the performance of combinations of twenty classification algorithms and thirteen feature selection methods. The best performer was the regularized regression method Elastic Net, using its built-in feature selection, on the data set integrating clinico-pathological data with microRNAs. The resulting hybrid signature contains four clinico-pathological features and fifteen microRNAs. We also evaluated how overall performance is affected by separating patients according to estrogen receptor (ER) status and by excluding HS molecules (novel microRNAs without an assigned miRBase ID) from the data set. Functional analysis of the microRNAs in the best classifier showed that they are involved in cancer-related processes.
This project has been conducted through the program Partnerships in priority areas - PN II, developed with the support of ANCS, CNDI - UEFISCDI, project no. PN-II-PT-CACM-2011-3.1-1221.
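To illustrate what "built-in feature selection" means for Elastic Net, the following is a minimal sketch, not the authors' implementation. It fits an elastic-net penalized logistic regression by proximal gradient descent on synthetic data; the feature names (`clin_a`, `mir_1`, and so on), the toy data generator, and the values of `lam`, `alpha`, `step`, and `n_iter` are all assumptions made for the example. Features whose coefficients are shrunk exactly to zero drop out, and the remaining ones form the signature.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; exact zeros are what remove a feature."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_elastic_net_logistic(X, y, lam=0.1, alpha=0.5, step=0.1, n_iter=300):
    """Proximal gradient descent for elastic-net penalized logistic regression.

    Penalty: lam * (alpha * ||w||_1 + (1 - alpha) / 2 * ||w||_2^2).
    The intercept b is not penalized.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        for j in range(p):
            # Gradient step on the smooth part (logistic loss + ridge term),
            # followed by the proximal (soft-thresholding) step for the L1 term.
            z = w[j] - step * (grad_w[j] + lam * (1 - alpha) * w[j])
            w[j] = soft_threshold(z, step * lam * alpha)
    return w, b


def make_toy_data(n=200, seed=0):
    """Synthetic stand-in for an integrated clinical + microRNA matrix."""
    rng = random.Random(seed)
    names = ["clin_a", "clin_b", "mir_1", "mir_2", "mir_3", "mir_4"]
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in names]
        # Only clin_a and mir_1 drive the synthetic relapse label.
        score = 2.0 * row[0] - 2.0 * row[2]
        y.append(1 if rng.random() < sigmoid(score) else 0)
        X.append(row)
    return names, X, y


names, X, y = make_toy_data()
w, b = fit_elastic_net_logistic(X, y)
# The "signature" is the set of features with non-zero coefficients.
signature = [name for name, wj in zip(names, w) if wj != 0.0]
print(signature)
```

In this sketch `alpha` balances the lasso and ridge parts of the penalty and `lam` sets its overall strength; in practice both would be tuned by cross-validation rather than fixed as they are here.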
Notes
1. Classification algorithms: Componentwise Boosting, Diagonal Discriminant Analysis, Elastic Net, Fisher Discriminant Analysis, Tree-based Boosting, k-nearest neighbors, Linear Discriminant Analysis, Lasso, Feed-Forward Neural Networks, Probabilistic nearest neighbors, Penalized Logistic Regression, Partial Least Squares with Linear Discriminant Analysis, Partial Least Squares with logistic regression, Partial Least Squares with Random Forest, Probabilistic Neural Networks, Quadratic Discriminant Analysis, Random Forest, PAM, Shrinkage Discriminant Analysis, Support Vector Machine.
2. Feature selection methods: t test, Welch test, Wilcoxon test, F test, Kruskal-Wallis test, moderated t and F test (limma), One-step Recursive Feature Elimination, random forest variable importance measure, Lasso, Elastic Net, componentwise boosting, Golub ad-hoc criterion, shrinkcat.
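Several of the feature selection methods listed above are univariate filters that score each feature separately before any classifier is trained. As a hedged illustration only, the sketch below ranks features by the absolute Welch t statistic between the two outcome classes; the function names, feature names, and the six-sample toy matrix are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    return (mean(a) - mean(b)) / math.sqrt(
        variance(a) / len(a) + variance(b) / len(b)
    )


def rank_features(X, y, names, top_k):
    """Return the top_k feature names ranked by |Welch t| between classes."""
    scores = {}
    for j, name in enumerate(names):
        relapse = [row[j] for row, label in zip(X, y) if label == 1]
        no_relapse = [row[j] for row, label in zip(X, y) if label == 0]
        scores[name] = abs(welch_t(relapse, no_relapse))
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical expression values: rows are patients, columns are features.
names = ["mir_a", "mir_b", "mir_c"]
X = [
    [5.1, 2.0, 0.9],
    [4.9, 2.2, 1.1],
    [5.3, 1.9, 1.0],
    [2.0, 2.1, 1.2],
    [2.2, 2.0, 0.8],
    [1.9, 2.3, 1.0],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = relapse, 0 = no relapse

selected = rank_features(X, y, names, top_k=2)
print(selected)
```

A filter like this is independent of the classifier, whereas embedded methods such as Lasso and Elastic Net select features as part of model fitting.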
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Birlutiu, A., Ardevan, D., Bulzu, P., Pintea, C., Floares, A. (2014). Integration of Clinico-Pathological and microRNA Data for Intelligent Breast Cancer Relapse Prediction Systems. In: Formenti, E., Tagliaferri, R., Wit, E. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2013. Lecture Notes in Computer Science, vol 8452. Springer, Cham. https://doi.org/10.1007/978-3-319-09042-9_13
DOI: https://doi.org/10.1007/978-3-319-09042-9_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09041-2
Online ISBN: 978-3-319-09042-9
eBook Packages: Computer Science, Computer Science (R0)