Robust Model Selection with LARS Based on S-estimators

  • Conference paper
Proceedings of COMPSTAT'2010

Abstract

We consider the problem of selecting a parsimonious subset of explanatory variables from a potentially large collection of covariates. We are concerned with the case where data quality may be unreliable (e.g. there may be outliers among the observations). When the number of available covariates is moderately large, fitting all possible subsets is not feasible. Sequential methods such as forward or backward selection are generally “greedy” and may fail to include important predictors when these are correlated. To avoid this problem, Efron et al. (2004) proposed the Least Angle Regression (LARS) algorithm, which produces a list of the available covariates ordered according to their relevance (sequencing). We introduce outlier-robust versions of the LARS algorithm based on S-estimators for regression (Rousseeuw and Yohai, 1984). The proposed approach is computationally efficient and suitable even when the number of variables exceeds the sample size. Simulation studies show that it is also robust to the presence of outliers in the data and compares favourably to previous proposals in the literature.
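To make the role of S-estimators more concrete, the following minimal sketch computes the robust residual scale (an M-scale) that an S-estimator of regression minimises over the coefficients. It is an illustration only, not the authors' algorithm: the Tukey bisquare loss, the tuning constant c = 1.54764 with right-hand side b = 0.5 (a common choice associated with a 50% breakdown point), the fixed-point iteration, and the toy residuals are all assumptions that do not come from the abstract.

```python
import statistics

TUKEY_C = 1.54764   # assumed bisquare tuning constant (common 50% breakdown choice)
TARGET_B = 0.5      # assumed right-hand side of the M-scale equation


def rho_bisquare(u, c=TUKEY_C):
    """Tukey bisquare loss, scaled so that it saturates at 1 for |u| >= c."""
    if abs(u) >= c:
        return 1.0
    t = 1.0 - (u / c) ** 2
    return 1.0 - t ** 3


def m_scale(residuals, b=TARGET_B, c=TUKEY_C, tol=1e-10, max_iter=200):
    """Solve mean(rho(r_i / s)) = b for s by fixed-point iteration."""
    # Start from the normalised median absolute residual.
    s = statistics.median(abs(r) for r in residuals) / 0.6745
    if s == 0.0:
        return 0.0
    n = len(residuals)
    for _ in range(max_iter):
        mean_rho = sum(rho_bisquare(r / s, c) for r in residuals) / n
        s_new = s * (mean_rho / b) ** 0.5
        if abs(s_new - s) <= tol * s:
            return s_new
        s = s_new
    return s


# Hypothetical residuals: the second list replaces one value by a gross outlier.
clean = [-1.2, -0.7, -0.3, -0.1, 0.0, 0.2, 0.4, 0.6, 0.9, 1.3]
contaminated = clean[:-1] + [50.0]

print("M-scale, clean:       ", m_scale(clean))
print("M-scale, contaminated:", m_scale(contaminated))
print("std dev, contaminated:", statistics.stdev(contaminated))
```

Because the loss is bounded, a single gross outlier contributes at most 1 to the sum, so the M-scale barely moves, whereas the sample standard deviation is inflated. An S-estimator of regression would search for the coefficients whose residuals make this scale as small as possible.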

References

  • AGOSTINELLI, C. (2002a): Robust model selection in regression via weighted likelihood methodology. Statistics and Probability Letters 56, 289-300.

  • AGOSTINELLI, C. (2002b): Robust stepwise regression. Journal of Applied Statistics 29(6), 825-840.

  • AGOSTINELLI, C. and MARKATOU, M. (2005): Robust model selection by cross-validation via weighted likelihood. Unpublished manuscript.

  • AKAIKE, H. (1970): Statistical predictor identification. Annals of the Institute of Statistical Mathematics 22, 203-217.

  • EFRON, B., HASTIE, T., JOHNSTONE, I. and TIBSHIRANI, R. (2004): Least angle regression. The Annals of Statistics 32(2), 407-499.

  • HAMPEL, F.R. (1983): Some aspects of model choice in robust statistics. In: Proceedings of the 44th Session of the ISI, volume 2, 767-771. Madrid.

  • HASTIE, T., TIBSHIRANI, R. and FRIEDMAN, J. (2001): The Elements of Statistical Learning. Springer-Verlag, New York.

  • KHAN, J.A., VAN AELST, S., and ZAMAR, R.H. (2007a): Building a robust linear model with forward selection and stepwise procedures. Computational Statistics and Data Analysis 52, 239-248.

  • KHAN, J.A., VAN AELST, S., and ZAMAR, R.H. (2007b): Robust Linear Model Selection Based on Least Angle Regression. Journal of the American Statistical Association 102, 1289-1299.

  • MALLOWS, C.L. (1973): Some comments on C_p. Technometrics 15, 661-675.

  • MARONNA, R.A., MARTIN, R.D. and YOHAI, V.J. (2006): Robust Statistics: Theory and Methods. Wiley, New York.

  • McCANN, L. and WELSCH, R.E. (2007): Robust variable selection using least angle regression and elemental set sampling. Computational Statistics and Data Analysis 52, 249-257.

  • MILLER, A.J. (2002): Subset selection in regression. Chapman-Hall, New York.

  • MORGENTHALER, S., WELSCH, R.E. and ZENIDE, A. (2003): Algorithms for robust model selection in linear regression. In: M. Hubert, G. Pison, A. Struyf and S. Van Aelst (Eds.): Theory and Applications of Recent Robust Methods. Birkhäuser-Verlag, Basel, 195-206.

  • MÜLLER, S. and WELSH, A. H. (2005): Outlier robust model selection in linear regression. Journal of the American Statistical Association 100, 1297-1310.

  • QIAN, G. and KÜNSCH, H.R. (1998): On model selection via stochastic complexity in robust linear regression. Journal of Statistical Planning and Inference 75, 91-116.

  • RONCHETTI, E. (1985): Robust model selection in regression. Statistics and Probability Letters 3, 21-23.

  • RONCHETTI, E. (1997): Robustness aspects of model choice. Statistica Sinica 7, 327-338.

  • RONCHETTI, E. and STAUDTE, R.G. (1994): A robust version of Mallows’ C_p. Journal of the American Statistical Association 89, 550-559.

  • RONCHETTI, E., FIELD, C. and BLANCHARD, W. (1997): Robust linear model selection by cross-validation. Journal of the American Statistical Association 92, 1017-1023.

  • ROUSSEEUW, P.J. and YOHAI, V.J. (1984): Robust regression by means of S-estimators. In: J. Franke, W. Härdle and D. Martin (Eds.): Robust and Nonlinear Time Series, Lecture Notes in Statistics 26. Springer-Verlag, Berlin, 256-272.

  • SALIBIAN-BARRERA, M. and VAN AELST, S. (2008): Robust model selection using fast and robust bootstrap. Computational Statistics and Data Analysis 52, 5121-5135.

  • SALIBIAN-BARRERA, M. and ZAMAR, R.H. (2002): Bootstrapping robust estimates of regression. The Annals of Statistics 30, 556-582.

  • SCHWARZ, G. (1978): Estimating the dimension of a model. The Annals of Statistics 6, 461-464.

  • SOMMER, S. and STAUDTE, R.G. (1995): Robust variable selection in regression in the presence of outliers and leverage points. Australian Journal of Statistics 37, 323-336.

  • TIBSHIRANI, R. (1996): Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B: Methodological 58, 267-288.

  • WEISBERG, S. (1985): Applied linear regression. Wiley, New York.

Author information

Corresponding author

Correspondence to Claudio Agostinelli.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Agostinelli, C., Salibian-Barrera, M. (2010). Robust Model Selection with LARS Based on S-estimators. In: Lechevallier, Y., Saporta, G. (eds) Proceedings of COMPSTAT'2010. Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-2604-3_6