
Ultra-high dimensional variable screening via Gram–Schmidt orthogonalization

  • Original Paper · Computational Statistics

Abstract

Independence screening procedures play a vital role in variable selection when the number of variables is massive. However, the high dimensionality of the data brings many challenges, such as multicollinearity or high (possibly spurious) correlation between the covariates, which makes marginal correlation unreliable as a measure of association between the covariates and the response. We propose a novel and simple screening procedure, Gram–Schmidt screening (GSS), which integrates classical Gram–Schmidt orthogonalization with the sure independence screening technique and thereby accounts for high correlations between the covariates in a data-driven way. GSS successfully discriminates between relevant and irrelevant variables, achieving a high true positive rate without including many irrelevant or redundant variables, and thus offers a new perspective on screening when the covariates are highly correlated. The practical performance of GSS is demonstrated through comparative simulation studies and the analysis of two real datasets.
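
The full paper is paywalled here, but the abstract describes the core idea clearly enough for a rough illustration: combine marginal (sure-independence-style) screening with Gram–Schmidt orthogonalization so that, once a variable is selected, its contribution is projected out of the remaining covariates and the response before the next selection. The NumPy sketch below is a hypothetical reading of that idea, in the spirit of orthogonal forward selection; the function name, the fixed selection budget `n_select`, and all implementation details are illustrative assumptions, not the authors' actual procedure.

```python
import numpy as np

def gram_schmidt_screening(X, y, n_select):
    """Hypothetical sketch of Gram-Schmidt screening (GSS).

    At each step, pick the covariate most correlated with the current
    residual response, then Gram-Schmidt-orthogonalize the remaining
    covariates and the response against the selected column.
    """
    X = X - X.mean(axis=0)   # center covariates
    y = y - y.mean()         # center response
    Xw = X.copy()            # working (orthogonalized) covariates
    yw = y.copy()            # working (orthogonalized) response
    selected = []
    for _ in range(n_select):
        # Marginal correlations with the current residual response;
        # the common factor ||yw|| is omitted since it does not affect ranking.
        norms = np.linalg.norm(Xw, axis=0)
        norms[norms < 1e-12] = np.inf      # skip numerically dead columns
        corr = np.abs(Xw.T @ yw) / norms
        corr[selected] = -np.inf           # never re-select a variable
        j = int(np.argmax(corr))
        selected.append(j)
        # Gram-Schmidt step: project the chosen direction out of
        # every remaining covariate and out of the response.
        q = Xw[:, j] / np.linalg.norm(Xw[:, j])
        Xw = Xw - np.outer(q, q @ Xw)
        yw = yw - q * (q @ yw)
    return selected

# Toy usage: a highly correlated pair (columns 0 and 1) where only
# column 0 is truly active, plus a second active variable (column 5).
rng = np.random.default_rng(0)
n, p = 100, 1000
X = rng.standard_normal((n, p))
X[:, 1] = X[:, 0] + 0.1 * rng.standard_normal(n)
y = 2.0 * X[:, 0] - 1.5 * X[:, 5] + rng.standard_normal(n)
print(gram_schmidt_screening(X, y, n_select=5))  # should include 0 and 5
```

Because each selected direction is projected out before the next ranking, a redundant near-copy of an already-selected variable (column 1 above) scores near zero on subsequent rounds, which is precisely the behavior plain marginal screening lacks under high correlation.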



Acknowledgements

The authors extend their grateful thanks to the Editors and the reviewers, whose comments have greatly improved the scope and presentation of the paper, and to Prof. Yuhong Yang and Yingying Ma for their valuable suggestions. This work was supported by the National Natural Science Foundation of China (Grant Nos. 71420107025 and 11701023).

Author information

Corresponding author: Shanshan Wang.


Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.



Cite this article

Wang, H., Liu, R., Wang, S. et al. Ultra-high dimensional variable screening via Gram–Schmidt orthogonalization. Comput Stat 35, 1153–1170 (2020). https://doi.org/10.1007/s00180-020-00963-7
