Abstract
MicroRNAs (miRNAs) play important roles in a variety of biological processes and can act as disease biomarkers, so methods for discovering disease-related miRNAs are needed. Human omics data, including miRNA expression profiles, typically contain orders of magnitude more descriptors (p) than samples (n), the so-called “p >> n problem”. Because traditional statistical methods tend to yield localized solutions in this setting, machine learning (ML) methods that perform sparse variable selection are expected to address the problem. Among the many ML methods, the least absolute shrinkage and selection operator (LASSO) and multivariate adaptive regression splines (MARS) return a small number of variables from supervised learning against endpoints such as human disease status. Here, we systematically compared LASSO and MARS for biomarker discovery using six miRNA expression data sets from human disease samples, obtained from the NCBI Gene Expression Omnibus (GEO). We additionally applied partial least squares discriminant analysis (PLS-DA) as a traditional control method to establish the baseline performance of discriminant methods. LASSO and MARS showed relatively higher performance than PLS-DA as the number of samples increased. In addition, some of the miRNA species identified by the ML methods have already been reported as candidate disease biomarkers in previous biological studies. These findings should extend our knowledge of how ML methods perform in the empirical use of clinical data.
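To make the idea of sparse variable selection under p >> n concrete, the following is a minimal sketch of LASSO fitted by cyclic coordinate descent on synthetic data. It is not the authors' pipeline: the data, the informative feature indices (3 and 17), the penalty value `lam=0.5`, and all function names are assumptions chosen for illustration, and a continuous response with squared loss stands in for the disease-status endpoints used in the study.

```python
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator: the closed-form update induced by the L1 penalty."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimise (1/2n) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b, and b starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes j's own contribution
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def make_synthetic(n_samples=40, n_features=200, seed=0):
    """Hypothetical data: 200 descriptors, 40 samples, only features 3 and 17 matter."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    y = [3.0 * row[3] - 3.0 * row[17] + rng.gauss(0.0, 0.1) for row in X]
    return X, y


X, y = make_synthetic()
beta = lasso_coordinate_descent(X, y, lam=0.5)
# The soft-threshold step sets unselected coefficients to exactly zero
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("selected feature indices:", selected)
```

The L1 penalty drives most coefficients to exactly zero, which is why the fitted model can be read as a short list of candidate variables even though there are five times more descriptors than samples here. In practice the penalty strength would be chosen by cross-validation rather than fixed by hand.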
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Higuchi, C., Tanaka, T., Okada, Y. (2015). Systematic Comparison of Machine Learning Methods for Identification of miRNA Species as Disease Biomarkers. In: Ortuño, F., Rojas, I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2015. Lecture Notes in Computer Science, vol 9044. Springer, Cham. https://doi.org/10.1007/978-3-319-16480-9_38
DOI: https://doi.org/10.1007/978-3-319-16480-9_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16479-3
Online ISBN: 978-3-319-16480-9
eBook Packages: Computer Science (R0)