A group VISA algorithm for variable selection

Published in: Statistical Methods & Applications

Abstract

We consider the problem of selecting grouped variables in a linear regression model based on penalized least squares. The group-Lasso and group-Lars procedures are designed to automatically perform both the shrinkage and the selection of important groups of variables. However, since a single tuning parameter governs both group selection and coefficient shrinkage (as in Lasso or Lars), these procedures can over-shrink the significant groups of variables or include many irrelevant groups of predictors. This situation occurs when the true number of non-zero groups of coefficients is small relative to the number \(p\) of variables. We introduce a novel sparse regression method, called Group-VISA (GVISA), which extends the VISA effect to grouped variables. It combines the idea of the VISA algorithm, which avoids over-shrinkage of the regression coefficients, with the idea of the GLars-type estimator, which shrinks and selects the members of a group together. Hence, GVISA is able to select a sparse group model while avoiding the over-shrinkage of the GLars-type estimator. We distinguish two variants of the GVISA algorithm, each associated with one version of GLars (I and II). Moreover, we provide a path algorithm, similar to GLars, for efficiently computing the entire sample path of the GVISA coefficients. We establish a theoretical sparsity inequality for the GVISA estimator, that is, a non-asymptotic bound on the estimation error. A detailed simulation study in small and high-dimensional settings illustrates the advantages of the new approach over several competing methods. Finally, we apply GVISA to two real data sets.
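As background for the over-shrinkage issue described above, the following is a minimal sketch of the standard group-Lasso estimator (Yuan and Lin 2006), solved by proximal gradient descent with block soft-thresholding. This is not the GVISA path algorithm itself; the function names, solver choice, and toy data are illustrative assumptions.

```python
import numpy as np

def group_soft_threshold(v, t):
    """Block soft-thresholding: shrink the whole group v toward zero,
    setting it exactly to zero when its norm falls below t."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1.0 - t / norm) * v

def group_lasso(X, y, groups, lam, n_iter=500):
    """Proximal-gradient (ISTA) solver for the group-Lasso criterion
    0.5 * ||y - X b||^2 + lam * sum_g sqrt(p_g) * ||b_g||."""
    n, p = X.shape
    b = np.zeros(p)
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)
        z = b - step * grad
        for g in groups:  # proximal step applied group by group
            w = np.sqrt(len(g))  # group-size weight, as in Yuan and Lin (2006)
            b[g] = group_soft_threshold(z[g], step * lam * w)
    return b

# Toy example: two relevant groups and one irrelevant group of predictors.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 6))
beta_true = np.array([2.0, -1.5, 0.0, 0.0, 1.0, 0.5])
y = X @ beta_true + 0.1 * rng.standard_normal(100)
groups = [[0, 1], [2, 3], [4, 5]]
b_hat = group_lasso(X, y, groups, lam=5.0)
```

Note that the single penalty parameter `lam` both decides which groups enter the model and how strongly their coefficients are shrunk; this coupling is precisely the source of the over-shrinkage behaviour that GVISA is designed to avoid.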


References

  • Bach F (2008) Consistency of the group Lasso and multiple kernel learning. J Mach Learn Res 9:1179–1225

  • Bakin S (1999) Adaptive regression and variable selection in data mining problems. Ph. D. thesis, Australian National University, Canberra

  • Bühlmann P, van de Geer S (2011) Statistics for high-dimensional data: methods, theory and applications. Springer, Berlin

  • Breheny P, Huang J (2014) Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors. Stat Comput. doi:10.1007/s11222-013-9424-2

  • Candes E, Tao T (2007) The Dantzig selector: statistical estimation when \(p\) is much larger than \(n\). Ann Stat 35:2313–2351

  • Duarte M, Bajwa W, Calderbank R (2011) The performance of group Lasso linear regression of grouped variables. Technical Report TR-2010-10, Department of Computer Science, Duke University

  • Efron B, Hastie T, Johnstone I, Tibshirani R (2004) Least angle regression. Ann Stat 32:407–499

  • Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360

  • Friedman J, Hastie T, Tibshirani R (2010) A note on the group lasso and a sparse group lasso. Preprint

  • Foygel R, Drton M (2010) Exact block-wise optimization in group Lasso for linear regression. arXiv preprint

  • Grandvalet Y, Canu S (1999) Outcomes of the equivalence of adaptive ridge with least absolute shrinkage. In: Advances in neural information processing systems 11 (NIPS 1998), pp 445–451

  • Jacob L, Obozinski G, Vert J (2009) Group Lasso with overlap and graph Lasso. In: Proceedings of the 26th annual international conference on machine learning, pp 433–440

  • Huang J, Ma S, Xie H, Zhang C-H (2009) A group bridge approach for variable selection. Biometrika 96:339–355

  • Lin Y, Zhang HH (2003) Component selection and smoothing in smoothing spline analysis of variance models. Technical Report 1072. Department of Statistics, University of Wisconsin, Madison

  • Liu H, Zhang J, Jiang X, Liu J (2010) The group Dantzig selector. J Mach Learn Res Proc Track 9:461–468

  • Lounici K, Pontil M, van de Geer S, Tsybakov A (2011) Oracle inequalities and optimal inference under group sparsity. Ann Stat 39:2164–2204

  • Meier L, van de Geer S, Bühlmann P (2008) The group lasso for logistic regression. J R Stat Soc Ser B 70:53–71

  • Meinshausen N (2007) Relaxed lasso. Comput Stat Data Anal 52:374–393

  • Meinshausen N, Bühlmann P (2006) High-dimensional graphs and variable selection with the Lasso. Ann Stat 34:1436–1462

  • Mkhadri A, Ouhourane M (2013) An extended variable inclusion and shrinkage algorithm for correlated variables. Comput Stat Data Anal 57:631–644

  • Nardi Y, Rinaldo A (2008) On the asymptotic properties of the group Lasso estimator for linear models. Electron J Stat 2:605–633

  • Negahban S, Ravikumar P, Wainwright M, Yu B (2012) A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. Stat Sci 27:538–557

  • Park M, Hastie T (2007) An \(\ell _1\) regularization path algorithm for generalized linear models. J R Stat Soc Ser B 69:659–677

  • Radchenko P, James GM (2008) Variable inclusion and shrinkage algorithms. J Am Stat Assoc 103(483):1304–1315

  • Ravikumar P, Liu H, Lafferty J, Wasserman L (2007) SpAM: sparse additive models. Adv Neural Inf Process Syst 20:1201–1208

  • Roth V, Fischer B (2008) The group-Lasso for generalized linear models: uniqueness of solutions and efficient algorithms. In: ICML '08: proceedings of the 25th international conference on machine learning

  • She Y (2012) An iterative algorithm for fitting nonconvex penalized generalized linear models with grouped predictors. Comput Stat Data Anal 56(10):2976–2990

  • Tibshirani R (1996) Regression shrinkage and selection via the Lasso. J R Stat Soc B 58:267–288

  • Wei F, Zhu H (2012) Group coordinate descent algorithms for nonconvex penalized regression. Comput Stat Data Anal 56:316–326

  • Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. J R Stat Soc Ser B 68:49–67

  • Zhang CH (2010) Nearly unbiased variable selection under minimax concave penalty. Ann Stat 38(2):894–942

  • Zou H, Hastie T (2005) Regularization and variable selection via the elastic-net. J R Stat Soc Ser B 67:301–320

Acknowledgments

We warmly thank the anonymous reviewers and Gerhard Tutz for their helpful comments and careful reading of previous versions of our paper.

Author information

Corresponding author

Correspondence to Abdallah Mkhadri.

About this article

Cite this article

Mkhadri, A., Ouhourane, M. A group VISA algorithm for variable selection. Stat Methods Appl 24, 41–60 (2015). https://doi.org/10.1007/s10260-014-0281-8

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10260-014-0281-8

Keywords

Navigation