ABSTRACT
In the digital era, time-to-event data collected from biomedical studies and healthcare settings are often high-dimensional, which poses computational challenges for traditional survival models. Feature selection (FS), a data-processing technique for dimensionality reduction, is therefore important for making full use of these data. This work introduces statistical, machine learning, and deep learning FS methods for time-to-event data, focusing mainly on lasso, elastic net, adaptive lasso, adaptive elastic net, random survival forest, and XGBoost. We also describe three state-of-the-art FS methods: BASIL, FilterDeepHit+, and SparseDeepHit+. We then compare the C-index of four basic FS methods in an experiment. Finally, we discuss future challenges and draw conclusions.