Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods

Drineas, Petros; Mahoney, Michael W.; Muthukrishnan, S.

doi:10.1007/11841036_29


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4168)

Included in the following conference series:

European Symposium on Algorithms


Abstract

Much recent work in theoretical computer science, linear algebra, and machine learning has considered matrix decompositions of the following form: given an m × n matrix A, decompose it as a product of three matrices, C, U, and R, where C consists of a small number of columns of A, R consists of a small number of rows of A, and U is a small, carefully constructed matrix that guarantees that the product CUR is “close” to A. Applications of such decompositions include the computation of matrix “sketches”, speeding up kernel-based statistical learning, preserving sparsity in low-rank matrix representations, and improved interpretability of data analysis methods. Our main result is a randomized, polynomial-time algorithm which, given as input an m × n matrix A, returns as output matrices C, U, R such that

$$\|{A-CUR}\|_F \leq (1+\epsilon)\|{A-A_k}\|_F$$

with probability at least 1 − δ. Here, A_k is the “best” rank-k approximation (provided by truncating the Singular Value Decomposition of A), and ||X||_F is the Frobenius norm of the matrix X. The number of columns in C and rows in R is a low-degree polynomial in k, 1/ε, and log(1/δ). Our main result is obtained by an extension of our recent relative-error approximation algorithm for ℓ₂ regression from overconstrained problems to general ℓ₂ regression problems. Our algorithm is simple, and it takes time of the order of the time needed to compute the top k right singular vectors of A. In addition, it samples the columns and rows of A via the method of “subspace sampling,” so named because the sampling probabilities depend on the lengths of the rows of the top singular vectors, and because they ensure that we capture entirely a certain subspace of interest.
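The idea of sampling with probabilities proportional to the squared row lengths of the top singular vectors can be illustrated with a minimal NumPy sketch. It is not the algorithm of the paper: the function names `subspace_probabilities` and `cur_sketch` are hypothetical, the rows are sampled using the top-k left singular vectors of A as a simplifying assumption, and U is taken to be pinv(C) A pinv(R), the Frobenius-optimal choice for fixed C and R, rather than the smaller matrix the authors construct. The sketch also computes a full SVD for clarity, so it makes no attempt to match the running time stated in the abstract.

```python
import numpy as np


def subspace_probabilities(V_k):
    """Sampling probabilities from the squared row lengths of V_k."""
    scores = np.sum(V_k ** 2, axis=1)  # squared Euclidean length of each row
    return scores / scores.sum()       # normalize so the probabilities sum to 1


def cur_sketch(A, k, c, r, rng):
    """Simplified CUR: sample c columns and r rows, then U = pinv(C) A pinv(R)."""
    U_svd, _, Vt = np.linalg.svd(A, full_matrices=False)
    p_cols = subspace_probabilities(Vt[:k].T)      # top-k right singular vectors
    p_rows = subspace_probabilities(U_svd[:, :k])  # top-k left singular vectors (assumption)
    cols = rng.choice(A.shape[1], size=c, replace=True, p=p_cols)
    rows = rng.choice(A.shape[0], size=r, replace=True, p=p_rows)
    C, R = A[:, cols], A[rows, :]
    # Assumption: Frobenius-optimal U for fixed C and R, not the paper's U.
    # rcond guards against repeated columns/rows, which make C and R
    # numerically rank-deficient.
    U = np.linalg.pinv(C, rcond=1e-10) @ A @ np.linalg.pinv(R, rcond=1e-10)
    return C, U, R


# Usage on a synthetic matrix of exact rank 3, where A_k = A for k = 3.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 40))
C, U, R = cur_sketch(A, k=3, c=12, r=12, rng=rng)
rel_err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
print(C.shape, U.shape, R.shape, rel_err)
```

Because this choice of U makes CUR the projection of A onto the column space of C and the row space of R, rescaling the sampled columns and rows would not change the product, so the sketch omits it.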



Author information

Authors and Affiliations

Department of Computer Science, RPI
Petros Drineas
Yahoo Research Labs
Michael W. Mahoney
Department of Computer Science, Rutgers University
S. Muthukrishnan


Editor information

Editors and Affiliations

Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA and School of Computer Science, Tel-Aviv University, 69978, Tel-Aviv, Israel
Yossi Azar
Department of Computer Science, University of Leicester, University Road, LE1 7RH, Leicester, UK
Thomas Erlebach


About this paper

Cite this paper

Drineas, P., Mahoney, M.W., Muthukrishnan, S. (2006). Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods. In: Azar, Y., Erlebach, T. (eds) Algorithms – ESA 2006. ESA 2006. Lecture Notes in Computer Science, vol 4168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11841036_29


DOI: https://doi.org/10.1007/11841036_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38875-3
Online ISBN: 978-3-540-38876-0
eBook Packages: Computer Science, Computer Science (R0)
