
An exact approach to sparse principal component analysis

  • Original Paper
  • Published in: Computational Statistics

Abstract

We present a branch-and-bound approach that finds exactly the best sparse dimension reduction of a matrix. The approach allows a choice between enforcing orthogonality of the coefficients and uncorrelatedness of the components, and the degree of sparsity can be set explicitly. We suggest methods for choosing the number of non-zero loadings for each component, and we illustrate our approach and compare it with existing methods on a benchmark data set.
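The abstract does not spell out the algorithm, so what follows is only a generic sketch of how a branch-and-bound search can solve a sparse PCA subproblem exactly; it should not be read as the paper's method. It assumes a single component with exactly `k` non-zero loadings, and it uses eigenvalue interlacing as the bound: the largest eigenvalue of a principal submatrix of the covariance matrix never exceeds that of a larger principal submatrix containing it. The function names `lambda_max`, `sparse_pc_branch_and_bound` and `brute_force` are hypothetical.

```python
import itertools
import numpy as np


def lambda_max(cov, support):
    """Largest eigenvalue of the principal submatrix of cov indexed by support."""
    idx = list(support)
    return float(np.linalg.eigvalsh(cov[np.ix_(idx, idx)])[-1])


def sparse_pc_branch_and_bound(cov, k):
    """Exact search for the k-variable support with maximal explained variance.

    Illustrative assumption, not the paper's algorithm.
    Branch: delete one variable at a time from the current support.
    Bound: by interlacing, lambda_max of a support bounds lambda_max of all
    its subsets, so a node at or below the incumbent can be pruned.
    """
    p = cov.shape[0]
    best = {"value": -np.inf, "support": None}

    def visit(support, start):
        # Only indices >= start may still be deleted, so that each subset
        # is reached by exactly one deletion order.
        removable = [j for j in support if j >= start]
        if len(removable) < len(support) - k:
            return  # cannot shrink to size k from this node
        value = lambda_max(cov, support)
        if value <= best["value"]:
            return  # prune: no subset of this support beats the incumbent
        if len(support) == k:
            best["value"], best["support"] = value, support
            return
        for j in removable:
            visit(tuple(i for i in support if i != j), j + 1)

    visit(tuple(range(p)), 0)

    # Recover unit-norm loadings: leading eigenvector on the chosen support,
    # zeros elsewhere.
    idx = list(best["support"])
    _, vecs = np.linalg.eigh(cov[np.ix_(idx, idx)])
    loadings = np.zeros(p)
    loadings[idx] = vecs[:, -1]
    return best["support"], best["value"], loadings


def brute_force(cov, k):
    """Reference answer by enumerating every support of size k."""
    p = cov.shape[0]
    support = max(itertools.combinations(range(p), k),
                  key=lambda s: lambda_max(cov, s))
    return support, lambda_max(cov, support)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal((200, 8))      # synthetic data, for illustration only
    cov = np.cov(data, rowvar=False)
    support, variance, loadings = sparse_pc_branch_and_bound(cov, k=3)
    print(support, variance)
    print(brute_force(cov, 3))                # should select the same support
```

The sketch handles one component only. Extracting further components under an orthogonality or uncorrelatedness constraint, as the abstract describes, would need additional constraints that are not modelled here.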



Author information

Corresponding author

Correspondence to Alessio Farcomeni.


About this article

Cite this article

Farcomeni, A. An exact approach to sparse principal component analysis. Comput Stat 24, 583–604 (2009). https://doi.org/10.1007/s00180-008-0147-3


  • DOI: https://doi.org/10.1007/s00180-008-0147-3
