Estimating a sparse reduction for general regression in high dimensions

Statistics and Computing

Abstract

Although the concept of sufficient dimension reduction has been around for a long time, studies in the literature have largely focused on properties of estimators of dimension-reduction subspaces in the classical “small p, large n” setting. Rather than the subspace, this paper directly considers the set of reduced predictors, which we believe are more relevant for subsequent analyses. A principled method is proposed for estimating a sparse reduction, based on a revised representation of a well-known existing method, sliced inverse regression. A fast and efficient algorithm is developed for computing the estimator. The asymptotic behavior of the new method is studied when the number of predictors, p, exceeds the sample size, n, providing a guide for choosing the number of sufficient dimension-reduction predictors. Numerical results, including a simulation study and a cancer-drug-sensitivity data analysis, are presented to examine the performance.
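
For orientation, the following is a minimal sketch of classical sliced inverse regression, the existing method that the abstract says the proposed estimator builds on. It is not the sparse estimator of this paper: it assumes n > p so that the sample covariance can be inverted, which is exactly the restriction the paper aims to remove. The function name `sir_directions` and its parameters are illustrative assumptions, not taken from the article.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_directions=1):
    """Classical sliced inverse regression (illustrative sketch, needs n > p).

    Returns a (p, n_directions) array whose unit-norm columns estimate
    directions in the dimension-reduction subspace.
    """
    n, p = X.shape

    # Center the predictors and form the sample covariance matrix.
    Xc = X - X.mean(axis=0)
    sigma = Xc.T @ Xc / n

    # Inverse square root of the covariance via its eigendecomposition.
    # This step fails when sigma is singular, e.g. when p > n.
    evals, evecs = np.linalg.eigh(sigma)
    sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ sigma_inv_sqrt  # standardized predictors

    # Slice the sample by the order of the response.
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)

    # Weighted covariance of the within-slice means of Z.
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Leading eigenvectors of M, mapped back to the original predictor scale.
    _, vecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_directions]
    B = sigma_inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)
```

Given a predictor matrix `X` and response `y`, the reduced predictors the abstract refers to would then be the columns of `X @ B`. In the p > n regime the covariance inversion above is unavailable, which is one reason a different, sparsity-inducing formulation is needed.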

Acknowledgments

The research of Tao Wang is supported by the Natural Science Foundation of China (Grant No. 11601326). Mengjie Chen’s research is supported by NIH R01 CA082659. Hongyu Zhao’s research is supported by NIH R01 GM59507. Lixing Zhu’s research is supported by the Natural Science Foundation of China (Grant No. 11671042). The authors thank the Editor, the Associate Editor, and the anonymous reviewers for their helpful comments, which have resulted in significant improvements to the article.

Corresponding author

Correspondence to Lixing Zhu.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 109 KB)

About this article

Cite this article

Wang, T., Chen, M., Zhao, H. et al. Estimating a sparse reduction for general regression in high dimensions. Stat Comput 28, 33–46 (2018). https://doi.org/10.1007/s11222-016-9714-6

  • DOI: https://doi.org/10.1007/s11222-016-9714-6
