Subspace Sampling and Relative-Error Matrix Approximation: Column-Based Methods

Drineas, Petros; Mahoney, Michael W.; Muthukrishnan, S.

doi:10.1007/11830924_30

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4110)

Included in the following conference series:

International Workshop on Approximation Algorithms for Combinatorial Optimization
International Workshop on Randomization and Approximation Techniques in Computer Science

Abstract

Given an m × n matrix A and an integer k less than the rank of A, the “best” rank-k approximation to A that minimizes the error with respect to the Frobenius norm is A_k, which is obtained by projecting A on the top k left singular vectors of A. While A_k is routinely used in data analysis, it is difficult to interpret and understand it in terms of the original data, namely the columns and rows of A. For example, these columns and rows often come from some application domain, whereas the singular vectors are linear combinations of (up to all) the columns or rows of A. We address the problem of obtaining low-rank approximations that are directly interpretable in terms of the original columns or rows of A. Our main results are two polynomial-time randomized algorithms that take as input a matrix A and return as output a matrix C, consisting of a “small” (i.e., a low-degree polynomial in k, 1/ε, and log(1/δ)) number of actual columns of A such that

||A - CC⁺A||_F ≤ (1 + ε) ||A - A_k||_F

with probability at least 1 - δ. Our algorithms are simple, and they take time of the order of the time needed to compute the top k right singular vectors of A. In addition, they sample the columns of A via the method of “subspace sampling,” so named because the sampling probabilities depend on the lengths of the rows of the top singular vectors and because they ensure that we capture entirely a certain subspace of interest.
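The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm as stated in the paper: it assumes a simplified scheme that draws c column indices independently with replacement, with probability proportional to the squared length of the corresponding row of the matrix of top k right singular vectors, and then drops duplicate indices (which does not change the column span of C). The function names, the test matrix, and the choice c = 20 are all hypothetical and do not reflect the sample sizes required by the paper's guarantee.

```python
import numpy as np

def subspace_sample_columns(A, k, c, rng):
    """Pick actual columns of A using probabilities derived from the
    top-k right singular vectors (a simplified, assumed scheme)."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    Vk_t = Vt[:k, :]                        # shape (k, n): top-k right singular vectors
    # Squared length of each row of V_k (= each column of Vk_t), normalized.
    probs = np.sum(Vk_t ** 2, axis=0)
    probs = probs / probs.sum()             # guards against rounding drift
    idx = rng.choice(A.shape[1], size=c, replace=True, p=probs)
    idx = np.unique(idx)                    # duplicates do not change span(C)
    return A[:, idx], idx

def projection_error(A, C):
    """Frobenius norm of A - C C^+ A, with C^+ the Moore-Penrose pseudoinverse."""
    return np.linalg.norm(A - C @ (np.linalg.pinv(C) @ A), "fro")

def best_rank_k_error(A, k):
    """Frobenius norm of A - A_k, computed from the trailing singular values."""
    s = np.linalg.svd(A, compute_uv=False)
    return np.sqrt(np.sum(s[k:] ** 2))

# Hypothetical test matrix: rank-3 signal plus small noise.
rng = np.random.default_rng(0)
m, n, k, c = 60, 40, 3, 20
A = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))
A = A + 0.01 * rng.standard_normal((m, n))

C, idx = subspace_sample_columns(A, k, c, rng)
ratio = projection_error(A, C) / best_rank_k_error(A, k)
print("columns kept:", idx.size, "error ratio:", ratio)
```

The printed ratio is the empirical counterpart of the factor (1 + ε) in the bound above. Because C may contain more than k columns, the ratio can also fall below 1 in this sketch.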

Author information

Authors and Affiliations

Petros Drineas: Department of Computer Science, RPI
Michael W. Mahoney: Yahoo Research Labs
S. Muthukrishnan: Department of Computer Science, Rutgers University

Editor information

Editors and Affiliations

Josep Díaz: Departament de Llenguatges i Sistemes Informatics, Universitat Politecnica de Catalunya, Campus Nord - Ed. Omega, 240 Jordi Girona Salgado, 1-3 E-08034, Barcelona
Klaus Jansen: Institute for Computer Science, University of Kiel, Olshausenstrasse 40, 24118, Kiel, Germany
José D. P. Rolim: Centre Universitaire d’Informatique, Battelle Bâtiment A, Route de Drize 7, 1227, Carouge, Geneva, Switzerland
Uri Zwick: School of Computer Science, Tel Aviv University, 69978, Tel Aviv, Israel

About this paper

Cite this paper

Drineas, P., Mahoney, M.W., Muthukrishnan, S. (2006). Subspace Sampling and Relative-Error Matrix Approximation: Column-Based Methods. In: Díaz, J., Jansen, K., Rolim, J.D.P., Zwick, U. (eds) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques. APPROX RANDOM 2006. Lecture Notes in Computer Science, vol 4110. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11830924_30

DOI: https://doi.org/10.1007/11830924_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38044-3
Online ISBN: 978-3-540-38045-0
eBook Packages: Computer Science, Computer Science (R0)
