Abstract
The availability of “big data” poses an ambitious challenge in the medical field, especially in personalized medicine, where gene expression data are increasingly used to establish a diagnosis and optimize the treatment of oncological patients. However, the high dimensionality of these data imposes many constraints, and several approaches have been proposed to address them, with regularization techniques at the forefront of current research. Additionally, the network structure of gene expression data has fostered the development of network-based regularization techniques that reduce the data to a low-dimensional and interpretable level. In this work, the classical elastic net and two recently proposed network-based methods, HubCox and OrphanCox, are applied to high-dimensional gene expression data to model survival outcomes. An oncological transcriptomic dataset obtained from The Cancer Genome Atlas (TCGA) is used, with patients’ RNA-seq measurements as covariates. Applying these sparsity-inducing techniques to the dataset enabled the selection of relevant genes over the range of parameters evaluated. The elastic net and the network-based OrphanCox yielded comparable results in terms of model performance and genes selected.
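To make the kind of objective discussed in the abstract concrete, the following is a minimal sketch of an elastic-net-penalized Cox objective with optional per-gene penalty weights. It is not the authors' implementation. It assumes a glmnet-style parameterization of the penalty, no tied event times, and a hypothetical `weights` vector standing in for whatever network-derived weighting HubCox or OrphanCox would supply; all function and parameter names are illustrative.

```python
import numpy as np


def cox_neg_log_partial_likelihood(beta, X, time, event):
    """Negative Cox log partial likelihood (assumes no tied event times)."""
    eta = X @ beta
    # Sort by descending time so a running log-sum-exp covers each risk set.
    order = np.argsort(-time)
    eta_sorted = eta[order]
    event_sorted = event[order]
    log_risk = np.logaddexp.accumulate(eta_sorted)
    return -np.sum((eta_sorted - log_risk)[event_sorted == 1])


def weighted_elastic_net_penalty(beta, lam, alpha, weights=None):
    """lam * sum_j w_j * (alpha * |b_j| + (1 - alpha) / 2 * b_j^2).

    `weights` is a hypothetical per-gene penalty factor; all ones gives
    the classical elastic net.
    """
    if weights is None:
        weights = np.ones_like(beta, dtype=float)
    l1 = np.sum(weights * np.abs(beta))
    l2 = 0.5 * np.sum(weights * beta ** 2)
    return lam * (alpha * l1 + (1.0 - alpha) * l2)


def penalized_objective(beta, X, time, event, lam, alpha, weights=None):
    """Objective a sparse Cox fit would minimize under these assumptions."""
    n = X.shape[0]
    nll = cox_neg_log_partial_likelihood(beta, X, time, event) / n
    return nll + weighted_elastic_net_penalty(beta, lam, alpha, weights)
```

Setting `alpha=1` recovers a lasso-type penalty and `alpha=0` a ridge-type penalty. Under the stated assumption, lowering a gene's entry in `weights` makes it cheaper for that gene to enter the model, which is one way a network-based weighting could steer selection.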
Partially funded by H2020 (No. 633974) and the Portuguese Foundation for Science & Technology FCT (UIDB/00297/2020, UIDB/04516/2020, UIDB/50021/2020, UIDB/50022/2020, PTDC/CCI-CIF/29877/2017, PTDC/CCI-INF/29168/2017 and SFRH/BD/97415/2013).
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Carrasquinha, E., Veríssimo, A., Lopes, M.B., Vinga, S. (2020). Network-Based Variable Selection for Survival Outcomes in Oncological Data. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2020. Lecture Notes in Computer Science, vol 12108. Springer, Cham. https://doi.org/10.1007/978-3-030-45385-5_49
DOI: https://doi.org/10.1007/978-3-030-45385-5_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-45384-8
Online ISBN: 978-3-030-45385-5
eBook Packages: Computer Science (R0)