Sparse eigen methods by D.C. programming

Authors:

Bharath K. Sriperumbudur,

David A. Torres,

Gert R. G. Lanckriet

ICML '07: Proceedings of the 24th international conference on Machine learning

Pages 831 - 838

https://doi.org/10.1145/1273496.1273601

Published: 20 June 2007

Abstract

Eigenvalue problems are ubiquitous in machine learning and statistics, arising in classification, dimensionality reduction, and related tasks. In this paper, we consider a cardinality-constrained variational formulation of the generalized eigenvalue problem, with sparse principal component analysis (PCA) as a special case. Previous methods approximate the cardinality constraint with the l₁-norm and have proposed both convex and non-convex solutions to the sparse PCA problem. In contrast, we propose a tighter approximation that is related to the negative log-likelihood of a Student's t-distribution. The problem is then framed as a d.c. (difference of convex functions) program and solved as a sequence of locally convex programs. We show that the proposed method not only explains more variance with sparse loadings on the principal directions but also scales better than other methods. We demonstrate these results on a collection of datasets of varying dimensionality, two of which are high-dimensional gene datasets where the goal is to find a few relevant genes that explain as much variance as possible.
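
The abstract does not spell out the surrogate or the update rule, so the following is only a minimal sketch of the general idea for the sparse PCA special case (identity constraint matrix, positive semidefinite covariance A). It assumes a log-based cardinality surrogate, sum_i log(1 + |x_i|/eps) / log(1 + 1/eps), and linearizes the concave parts of the objective at the current iterate so that each step is a convex problem with a closed-form solution. The function names, the penalty weight rho, the smoothing parameter eps, and the starting point are all illustrative assumptions, not the authors' implementation.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    return [dot(row, x) for row in A]


def log_surrogate(x, eps):
    """Assumed cardinality surrogate; tends to the number of nonzeros as eps -> 0."""
    return sum(math.log(1 + abs(xi) / eps) for xi in x) / math.log(1 + 1 / eps)


def objective(A, x, rho, eps):
    """Explained variance minus the sparsity penalty (to be maximized)."""
    return dot(x, matvec(A, x)) - rho * log_surrogate(x, eps)


def dc_step(A, x, rho, eps):
    """One linearized convex subproblem, assuming A is PSD and B = I.

    Solves  max_x  2 (A x_l)^T x - sum_i c_i |x_i|   s.t.  ||x||_2 <= 1,
    with c_i = rho_eps / (|x_l,i| + eps), via soft-thresholding and rescaling.
    """
    rho_eps = rho / math.log(1 + 1 / eps)
    a = matvec(A, x)
    s = []
    for ai, xi in zip(a, x):
        c = rho_eps / (abs(xi) + eps)  # reweighted l1 weight from the log term
        s.append(math.copysign(max(2 * abs(ai) - c, 0.0), ai))
    norm = math.sqrt(dot(s, s))
    return [si / norm for si in s] if norm > 0 else [0.0] * len(x)


def sparse_pca_dc(A, rho, eps=1e-3, iters=50):
    """Iterate the linearized subproblem from a dense feasible start (arbitrary choice)."""
    n = len(A)
    x = [1 / math.sqrt(n)] * n
    history = [objective(A, x, rho, eps)]
    for _ in range(iters):
        x = dc_step(A, x, rho, eps)
        history.append(objective(A, x, rho, eps))
    return x, history


# Toy covariance matrix (hypothetical data, diagonally dominant hence PSD).
A = [[4.0, 2.0, 0.1],
     [2.0, 3.0, 0.1],
     [0.1, 0.1, 0.5]]
x, history = sparse_pca_dc(A, rho=0.5)
print("loadings:", x)
print("nonzeros:", sum(1 for xi in x if xi != 0))
```

Because each subproblem maximizes a lower bound that touches the true objective at the current iterate, the objective values in history should never decrease. Larger values of rho push more loadings to exactly zero, at the cost of explained variance.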

Published In

ICML '07: Proceedings of the 24th international conference on Machine learning

June 2007

1233 pages

ISBN:9781595937933

DOI:10.1145/1273496

Editor:
Zoubin Ghahramani
University of Cambridge, United Kingdom

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Machine Learning Journal

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICML '07 & ILP '07

Sponsor:

ICML '07 & ILP '07: The 24th Annual International Conference on Machine Learning held in conjunction with the 2007 International Conference on Inductive Logic Programming

June 20 - 24, 2007

Corvallis, Oregon, USA

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%
