A graphical tool for selecting the number of slices and the dimension of the model in SIR and SAVE approaches

  • Original Paper
Computational Statistics

Abstract

Sliced inverse regression (SIR) and related methods were introduced to reduce the dimensionality of regression problems. In a general semiparametric regression framework, these methods determine linear combinations of a set of explanatory variables X related to the response variable Y, without losing information on the conditional distribution of Y given X. Both their population and sample versions rely on a “slicing step”. They are sensitive to the choice of the number H of slices, and this is particularly true for the SIR-II and SAVE methods. At the moment there are neither theoretical results nor practical techniques that allow the user to choose an appropriate number of slices. In this paper, we propose an approach based on the quality of the estimation of the effective dimension reduction (EDR) space: the square trace correlation between the true EDR space and its estimate can be used as a measure of goodness of estimation. We introduce a naïve bootstrap estimation of the square trace correlation criterion to allow selection of an “optimal” number of slices. Moreover, this criterion can simultaneously select the corresponding suitable dimension K (the number of linear combinations of X). From a practical point of view, the choice of these two parameters H and K is essential. We propose a 3D graphical tool, implemented in R, which can be used to select a suitable pair (H, K). An R package named “edrGraphicalTools” has been developed. In this article, we focus on the SIR-I, SIR-II and SAVE methods. Moreover, the proposed criterion can be used to determine which of these methods seems efficient at recovering the EDR space, that is, the structure between Y and X. We indicate how the proposed criterion can be used in practice. A simulation study is performed to illustrate the behavior of this approach and the need to select the number H of slices and the dimension K properly. A short real-data example is also provided.
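To make the criterion described in the abstract more concrete, the following is a minimal Python sketch, not the authors' R implementation. It assumes the usual definition of the square trace correlation between two K-dimensional subspaces, Tr(P_A P_B) / K, and uses Euclidean orthogonal projectors, which is only appropriate if X has been standardized beforehand. The bootstrap part is one plausible reading of "naïve bootstrap": the estimate from the full sample is compared with estimates from resampled data, and the scores are averaged. All names (`orthonormalize`, `square_trace_correlation`, `bootstrap_criterion`) are hypothetical, and `estimate_basis` stands for any user-supplied function that returns K direction vectors for a fixed pair (H, K), for example a SIR fit.

```python
import math
import random


def orthonormalize(vectors):
    """Gram-Schmidt: return an orthonormal basis of span(vectors)."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            coef = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - coef * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > 1e-12:  # skip vectors that are (numerically) dependent
            basis.append([wi / norm for wi in w])
    return basis


def square_trace_correlation(basis_a, basis_b):
    """Tr(P_A P_B) / K for two K-dimensional subspaces (Euclidean metric).

    With orthonormal bases Q_A and Q_B, Tr(P_A P_B) equals the sum of the
    squared inner products between every pair of basis vectors.
    """
    qa, qb = orthonormalize(basis_a), orthonormalize(basis_b)
    if not qa or len(qa) != len(qb):
        raise ValueError("subspaces must share the same nonzero dimension")
    total = sum(
        sum(x * y for x, y in zip(a, b)) ** 2 for a in qa for b in qb
    )
    return total / len(qa)


def bootstrap_criterion(data, estimate_basis, n_boot=50, seed=0):
    """Average square trace correlation between the full-sample estimate
    and estimates computed on bootstrap resamples of the rows of `data`.

    `estimate_basis` is an assumed callable: rows -> list of K vectors.
    """
    rng = random.Random(seed)
    reference = estimate_basis(data)
    n = len(data)
    scores = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        scores.append(
            square_trace_correlation(reference, estimate_basis(resample))
        )
    return sum(scores) / n_boot
```

Under these assumptions, a value close to 1 indicates that the estimated subspace is stable across resamples, and a value close to 0 indicates that it is not. Evaluating `bootstrap_criterion` over a grid of candidate (H, K) pairs would give the kind of surface that a 3D plot could display.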

Author information

Corresponding author

Correspondence to Jérôme Saracco.

About this article

Cite this article

Liquet, B., Saracco, J. A graphical tool for selecting the number of slices and the dimension of the model in SIR and SAVE approaches. Comput Stat 27, 103–125 (2012). https://doi.org/10.1007/s00180-011-0241-9

  • DOI: https://doi.org/10.1007/s00180-011-0241-9
