Estimating a sparse reduction for general regression in high dimensions

Wang, Tao; Chen, Mengjie; Zhao, Hongyu; Zhu, Lixing

doi:10.1007/s11222-016-9714-6

Published: 21 October 2016

Statistics and Computing, volume 28, pages 33–46 (2018)

Abstract

Although the concept of sufficient dimension reduction has been around for a long time, studies in the literature have largely focused on properties of estimators of dimension-reduction subspaces in the classical “small p, large n” setting. Rather than the subspace, this paper directly considers the set of reduced predictors, which we believe is more relevant for subsequent analyses. A principled method is proposed for estimating a sparse reduction, based on a new representation of the well-known sliced inverse regression method. A fast and efficient algorithm is developed for computing the estimator. The asymptotic behavior of the new method is studied when the number of predictors, p, exceeds the sample size, n, which provides a guide for choosing the number of sufficient dimension-reduction predictors. Numerical results, including a simulation study and a cancer-drug-sensitivity data analysis, are presented to examine the method's performance.
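
For orientation, the classical sliced inverse regression procedure that the abstract builds on can be sketched as below. This is only the standard textbook version for the "small p, large n" setting, not the sparse high-dimensional estimator proposed in the article; the function name `sir_directions`, the default slice count, and the simulated data are illustrative assumptions.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_directions=1):
    """Classical sliced inverse regression (assumes n > p, nonsingular covariance).

    Returns a (p, n_directions) array whose columns are unit-norm estimates
    of the dimension-reduction directions.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Step 1: standardize the predictors, Z = (X - mean) Sigma^{-1/2}.
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / n
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(1.0 / np.sqrt(evals)) @ evecs.T
    Z = Xc @ inv_sqrt

    # Step 2: slice the sample by the ordered response and
    # accumulate the weighted covariance of the slice means.
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        slice_mean = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(slice_mean, slice_mean)

    # Step 3: leading eigenvectors of M, mapped back to the original scale.
    _, vecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_directions]
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)


if __name__ == "__main__":
    # Simulated single-index model (an assumption made purely for illustration).
    rng = np.random.default_rng(0)
    X = rng.standard_normal((2000, 5))
    y = (X[:, 0] + X[:, 1]) ** 3 + 0.1 * rng.standard_normal(2000)
    print(sir_directions(X, y).ravel())
```

The reduced predictors in this sketch are the columns of `X @ B`. The standardization step inverts the sample covariance, which is why this classical form breaks down when p exceeds n, the regime the article targets.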

Acknowledgments

The research of Tao Wang is supported by Natural Science Foundation of China (Grant No. 11601326). Mengjie Chen’s research is supported by NIH R01 CA082659. Hongyu Zhao’s research is supported by NIH R01 GM59507. Lixing Zhu’s research is supported by Natural Science Foundation of China (Grant No. 11671042). The authors thank the Editor, the Associate Editor, and the anonymous reviewers for their helpful comments that have resulted in significant improvements in the article.

Author information

Authors and Affiliations

Department of Bioinformatics and Biostatistics, Shanghai Jiao Tong University, Shanghai, China
Tao Wang
Department of Biostatistics, Yale University, New Haven, CT, USA
Tao Wang & Hongyu Zhao
Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
Mengjie Chen
Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong
Lixing Zhu

Corresponding author

Correspondence to Lixing Zhu.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 109 KB)

About this article

Cite this article

Wang, T., Chen, M., Zhao, H. et al. Estimating a sparse reduction for general regression in high dimensions. Stat Comput 28, 33–46 (2018). https://doi.org/10.1007/s11222-016-9714-6

Received: 22 February 2016
Accepted: 12 October 2016
Published: 21 October 2016
Issue Date: January 2018
DOI: https://doi.org/10.1007/s11222-016-9714-6
