Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods

  • Conference paper
Algorithms – ESA 2006 (ESA 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4168)

Abstract

Much recent work in theoretical computer science, linear algebra, and machine learning has considered matrix decompositions of the following form: given an m × n matrix A, decompose it as a product of three matrices, C, U, and R, where C consists of a small number of columns of A, R consists of a small number of rows of A, and U is a small, carefully constructed matrix that guarantees that the product CUR is “close” to A. Applications of such decompositions include the computation of matrix “sketches”, speeding up kernel-based statistical learning, preserving sparsity in low-rank matrix representations, and improving the interpretability of data analysis methods. Our main result is a randomized, polynomial-time algorithm which, given as input an m × n matrix A, returns as output matrices C, U, and R such that

$$\|A - CUR\|_F \leq (1+\epsilon)\|A - A_k\|_F$$

with probability at least 1 − δ. Here, A_k is the “best” rank-k approximation to A (obtained by truncating the singular value decomposition of A), and ||X||_F denotes the Frobenius norm of the matrix X. The number of columns in C and the number of rows in R are each a low-degree polynomial in k, 1/ε, and log(1/δ). Our main result is obtained by extending our recent relative-error approximation algorithm for ℓ2 regression from overconstrained problems to general ℓ2 regression problems. Our algorithm is simple, and it runs in time of the order of the time needed to compute the top k right singular vectors of A. In addition, it samples the columns and rows of A via the method of “subspace sampling,” so named because the sampling probabilities depend on the lengths of the rows of the top singular vectors, and because they ensure that we capture, in its entirety, a certain subspace of interest.
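To make the sampling idea concrete, the following is a minimal numerical sketch of a CUR approximation built by subspace sampling. It is illustrative only: the helper names (subspace_sampling_probs, cur_sketch), the sample sizes, and the simple choice U = C⁺AR⁺ are assumptions made for this sketch, not the paper’s exact construction. The probabilities follow the idea described above: each column (row) of A is picked with probability proportional to the squared length of the corresponding row of the top-k right (left) singular vectors.

```python
import numpy as np

def subspace_sampling_probs(W):
    """Probabilities proportional to the squared row lengths of W,
    whose columns span a top-k singular subspace (hypothetical helper)."""
    lev = np.sum(W * W, axis=1)      # squared Euclidean row norms ("leverage scores")
    return lev / lev.sum()           # normalize to a probability distribution

def cur_sketch(A, k, c, r, seed=None):
    """Sample c columns and r rows of A by subspace sampling and
    return an illustrative CUR approximation (not the paper's algorithm)."""
    rng = np.random.default_rng(seed)
    U_full, _, Vt = np.linalg.svd(A, full_matrices=False)
    p_cols = subspace_sampling_probs(Vt[:k].T)       # rows of V_k -> column probabilities
    p_rows = subspace_sampling_probs(U_full[:, :k])  # rows of U_k -> row probabilities
    cols = rng.choice(A.shape[1], size=c, replace=True, p=p_cols)
    rows = rng.choice(A.shape[0], size=r, replace=True, p=p_rows)
    C, R = A[:, cols], A[rows, :]
    # A simple (assumed) choice of the small matrix: U = C^+ A R^+,
    # i.e., project A onto the column space of C and the row space of R.
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R

# Usage: compare the CUR error with the best rank-k error ||A - A_k||_F.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 50)) @ rng.standard_normal((50, 100))
k = 5
C, U, R = cur_sketch(A, k, c=10 * k, r=10 * k, seed=1)
U_full, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = (U_full[:, :k] * s[:k]) @ Vt[:k]               # truncated SVD of A
err_cur = np.linalg.norm(A - C @ U @ R, "fro")
err_k = np.linalg.norm(A - A_k, "fro")
print(f"||A - CUR||_F / ||A - A_k||_F = {err_cur / err_k:.3f}")
```

In the paper, the number of sampled columns and rows, the rescaling of the samples, and the construction of U are chosen so that the (1 + ε) guarantee above holds with probability at least 1 − δ; the sketch only conveys the shape of the computation.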




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Drineas, P., Mahoney, M.W., Muthukrishnan, S. (2006). Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods. In: Azar, Y., Erlebach, T. (eds) Algorithms – ESA 2006. ESA 2006. Lecture Notes in Computer Science, vol 4168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11841036_29

  • DOI: https://doi.org/10.1007/11841036_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-38875-3

  • Online ISBN: 978-3-540-38876-0
