Abstract
Principal component analysis (PCA) is often used to reduce the dimension of data by selecting a few orthonormal vectors that explain most of the variance structure of the data. \(L_1\) PCA uses the \(L_1\) norm to measure error, whereas conventional PCA uses the \(L_2\) norm. For the \(L_1\) PCA problem of minimizing the fitting error of the reconstructed data, we propose three algorithms based on iteratively reweighted least squares. We first develop an exact reweighted algorithm. Next, an approximate version is developed based on eigenpair approximation when the algorithm is near convergence. Finally, the approximate version is extended using stochastic singular value decomposition. We provide convergence analyses and compare the performance of the algorithms against benchmark algorithms in the literature. The computational experiments show that the proposed algorithms consistently perform best and that their scalability improves as eigenpair approximation and stochastic singular value decomposition are employed.
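The following R sketch illustrates the reweighting idea behind these algorithms. It is a simplified, row-weighted variant written for intuition only: the function name, the observation-level weighting scheme, and all defaults are assumptions and do not reproduce the paper's exact algorithms. Each iteration solves a weighted least squares (SVD) subproblem and then reweights every observation by the inverse of its current \(L_1\) reconstruction error.

# Illustrative IRLS-style sketch for L1-norm PCA (row-weighted simplification;
# not the paper's exact algorithms). Uses only base R.
irls_l1_pca <- function(X, q = 2, max_iter = 50, tol = 1e-6, eps = 1e-8) {
  w <- rep(1, nrow(X))                       # observation weights
  obj_old <- Inf
  for (iter in seq_len(max_iter)) {
    mu <- colSums(w * X) / sum(w)            # weighted center
    Xc <- sweep(X, 2, mu)                    # centered data
    V  <- svd(sqrt(w) * Xc)$v[, 1:q, drop = FALSE]   # weighted least squares fit
    R  <- Xc - Xc %*% V %*% t(V)             # reconstruction residuals
    r  <- rowSums(abs(R))                    # L1 error of each observation
    obj <- sum(r)
    if (abs(obj_old - obj) < tol * max(1, obj)) break
    obj_old <- obj
    w <- 1 / pmax(r, eps)                    # reweight: large L1 error -> small weight
  }
  list(center = mu, loadings = V, l1_error = obj, iterations = iter)
}

# Small usage example: synthetic data with a few gross outliers
set.seed(1)
X <- matrix(rnorm(200 * 5), 200, 5)
X[1:5, ] <- X[1:5, ] + 20                    # outlying observations
fit <- irls_l1_pca(X, q = 2)

The reweighting step is what distinguishes this from ordinary \(L_2\) PCA: observations that are poorly reconstructed under the \(L_1\) criterion are progressively down-weighted rather than allowed to dominate the fitted subspace.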




Notes
Iteratively reweighted least squares (IRLS) is an algorithmic framework for solving weighted least squares problems in which the weights depend on the model parameters. Because the weights change with the model parameters, the algorithm iterates until a convergence or termination criterion is met.
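A minimal R sketch of this idea for least absolute deviations (\(L_1\)) regression follows; the function name and tolerances are assumptions, and the example is meant only to illustrate the footnote, not any algorithm from the paper. The weights \(1/|r_i|\) depend on the current residuals, so they are recomputed in every iteration.

# Minimal IRLS illustration: least absolute deviations (L1) regression.
# The weights depend on the current coefficients, so they are updated
# each iteration until the coefficients stop changing.
irls_lad <- function(X, y, max_iter = 100, tol = 1e-8, eps = 1e-8) {
  beta <- solve(crossprod(X), crossprod(X, y))        # ordinary least squares start
  for (iter in seq_len(max_iter)) {
    r <- drop(y - X %*% beta)                         # residuals under current fit
    w <- 1 / pmax(abs(r), eps)                        # weights derived from the fit
    beta_new <- solve(crossprod(X, w * X), crossprod(X, w * y))  # weighted LS step
    if (max(abs(beta_new - beta)) < tol) { beta <- beta_new; break }
    beta <- beta_new
  }
  drop(beta)
}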
Cite this article
Park, Y.W., Klabjan, D. Three iteratively reweighted least squares algorithms for \(L_1\)-norm principal component analysis. Knowl Inf Syst 54, 541–565 (2018). https://doi.org/10.1007/s10115-017-1069-6