Abstract
This paper studies estimation and variable selection for the sufficient dimension reduction space with survival data, using a new combination of the \(L_1\) penalty and the refined outer product of gradients method (rOPG; Xia et al. in J R Stat Soc Ser B 64:363–410, 2002), hereafter called SH-OPG. SH-OPG can exhaustively estimate the central subspace and select the informative covariates simultaneously. Moreover, the estimated directions automatically remain orthogonal after the noninformative regressors are dropped. The efficiency of SH-OPG is verified through extensive simulation studies and a real data analysis.
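For orientation, the outer product of gradients idea that rOPG refines can be sketched as follows. This is a minimal illustration of plain OPG for an uncensored response, not the SH-OPG procedure: it handles neither censoring nor the \(L_1\) penalty, and it omits the iterative refinement step of rOPG. The function name `opg_directions`, the Gaussian kernel, the fixed `bandwidth` argument, and the small ridge term are all assumptions made for this example.

```python
import numpy as np

def opg_directions(X, y, d, bandwidth):
    """Plain OPG sketch (hypothetical helper, not the paper's SH-OPG).

    Fits a kernel-weighted local linear regression at every sample point,
    averages the outer products of the estimated gradients, and returns
    the d leading eigenvectors as estimated directions.
    """
    n, p = X.shape
    M = np.zeros((p, p))
    for j in range(n):
        diff = X - X[j]                                   # covariates centred at X[j]
        w = np.exp(-0.5 * np.sum(diff**2, axis=1) / bandwidth**2)  # Gaussian kernel weights
        Z = np.hstack([np.ones((n, 1)), diff])            # intercept plus local slope terms
        WZ = Z * w[:, None]
        # Weighted least squares; the tiny ridge is an assumption for numerical stability.
        coef = np.linalg.solve(Z.T @ WZ + 1e-8 * np.eye(p + 1), WZ.T @ y)
        grad = coef[1:]                                   # estimated gradient at X[j]
        M += np.outer(grad, grad) / n
    eigvals, eigvecs = np.linalg.eigh(M)                  # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :d]                        # columns: d leading eigenvectors
```

Because the returned columns are eigenvectors of a symmetric matrix, they are orthonormal by construction. A sparse variant would additionally shrink rows of the direction matrix toward zero, which this sketch does not attempt.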
References
Bennett S (1983) Analysis of survival data by the proportional odds model. Stat Med 2:273–277
Breiman L (1995) Better subset regression using the nonnegative garrote. Technometrics 37:373–384
Cook RD (1998) Regression graphics. Wiley, New York
Cook RD, Weisberg S (1991) Sliced inverse regression for dimension reduction: comment. J Am Stat Assoc 86:328–332
Cox DR (1972) Regression models and life-tables (with discussion). J R Stat Soc Ser B 34:187–220
Efron B, Hastie T, Johnstone I, Tibshirani R (2004) Least angle regression. Ann Stat 32:407–499
Fan J, Gijbels I (1996) Local polynomial modelling and its applications. Chapman & Hall, New York
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
Goldberg RJ, Gore JM, Alpert JS, Dalen JE (1986) Recent changes in the attack rates and survival of acute myocardial infarction (1975–1981): the Worcester heart attack study. J Am Med Assoc 255:2774–2779
Hosmer DW, Lemeshow S, May S (2008) Applied survival analysis: regression modeling of time to event data, 2nd edn. Wiley, New York
Jin Z, Lin DY, Ying Z (2006) On least-squares regression with censored data. Biometrika 93:147–161
Jolliffe I (1986) Principal component analysis. Springer, New York
Kalbfleisch J, Prentice R (2002) The statistical analysis of failure time data, 2nd edn. Wiley, New York
Li KC (1991) Sliced inverse regression for dimension reduction. J Am Stat Assoc 86:316–327
Li KC (1992) On principal Hessian directions for data visualization and dimension reduction: another application of Stein’s lemma. J Am Stat Assoc 87:1025–1039
Li L (2007) Sparse sufficient dimension reduction. Biometrika 94:603–613
Li KC, Wang JL, Chen CH (1999) Dimension reduction for censored regression data. Ann Stat 27:1–23
Lu W, Li L (2011) Sufficient dimension reduction for censored regressions. Biometrics 67:513–523
Samarov AM (1993) Exploring regression structure using nonparametric functional estimation. J Am Stat Assoc 88:836–847
Scott DW (1992) Multivariate density estimation: theory, practice and visualization. Wiley, New York
Silverman BW (1986) Density estimation for statistics and data analysis. Chapman & Hall, London
Spierdijk L (2008) Nonparametric conditional hazard rate estimation: a local linear approach. Comput Stat Data Anal 52:2419–2434
Therneau T, Grambsch PM (2000) Modeling survival data: extending the Cox model. Springer, New York
Tibshirani RJ (1996) Regression shrinkage and selection via the LASSO. J R Stat Soc Ser B 58:267–288
Tibshirani R (1997) The Lasso method for variable selection in the Cox model. Stat Med 16:385–395
Wang Q, Yin X (2008) A nonlinear multi-dimensional variable selection method for high dimensional data: sparse MAVE. Comput Stat Data Anal 52:4512–4520
Xia Y (2007) A constructive approach to the estimation of dimension reduction directions. Ann Stat 35:2654–2690
Xia Y, Tong H, Li WK, Zhu L (2002) An adaptive estimation of dimension reduction space (with discussion). J R Stat Soc Ser B 64:363–410
Xia Y, Zhang D, Xu J (2010) Dimension reduction and semiparametric estimation of survival models. J Am Stat Assoc 105:278–290
Ye G, Xie X (2012) Learning sparse gradients for variable selection and dimension reduction. Mach Learn 87:303–355
Zeng D, Lin DY (2007) Maximum likelihood estimation in semiparametric models with censored data (with discussion). J R Stat Soc Ser B 69:507–564
Zhang W, Steele F (2004) A semiparametric multilevel survival model. J R Stat Soc Ser C 53:387–404
Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc Ser B 67:301–320
Zou H, Hastie T, Tibshirani R (2006) Sparse principal component analysis. J Comput Graph Stat 15:265–286
Acknowledgments
The authors are very grateful to the Editor, the Associate Editor, two anonymous referees and Professor Y. Xia for their helpful comments and constructive advice.
Additional information
This work is supported by the National Natural Science Foundation of China, grant number 11071113.
Cite this article
Yan, C., Zhang, D. Sparse dimension reduction for survival data. Comput Stat 28, 1835–1852 (2013). https://doi.org/10.1007/s00180-012-0383-4
DOI: https://doi.org/10.1007/s00180-012-0383-4