
Randomization or Condensation?: Linear-Cost Matrix Sketching Via Cascaded Compression Sampling

Published: 04 August 2017

ABSTRACT

Matrix sketching aims to find a compact representation of a matrix that preserves most of its properties, and it is a fundamental building block of modern scientific computing. Randomized algorithms represent the state of the art and have attracted great interest from machine learning, data mining, and theoretical computer science. However, they still require access to the entire input matrix to produce the desired factorizations, which can be a major computational and memory bottleneck in truly large problems. In this paper, we uncover an interesting theoretical connection between low-rank matrix decomposition and lossy signal compression, based on which a cascaded compression sampling framework is devised to approximate an m-by-n matrix in only O(m+n) time and space. The proposed method accesses only a small number of matrix rows and columns, which significantly reduces the memory footprint. Meanwhile, by chaining two rounds of approximation and upgrading the sampling strategy from uniform probabilities to a more sophisticated, encoding-oriented sampling, the algorithm is substantially boosted and uncovers finer-grained structure in the data. Empirical results on a wide spectrum of real-world, large-scale matrices show that, while taking only linear time and space, the accuracy of our method rivals that of state-of-the-art randomized algorithms consuming a quadratic, O(mn), amount of resources.
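To make the high-level recipe in the abstract concrete, the sketch below illustrates a simplified two-round, row/column (CUR-style) approximation in Python. It is not the authors' exact algorithm: the function name cascaded_cur_sketch, the sample sizes c and r, and the norm-based importance weights used in the second round are illustrative assumptions standing in for the paper's encoding-oriented sampling. The point it demonstrates is that only sampled rows and columns of A are ever read, so the cost of forming the factors scales with m+n rather than mn.

import numpy as np

def cascaded_cur_sketch(A, c=100, r=100, seed=0):
    """Illustrative two-round row/column sketch of an m-by-n matrix A.

    Round 1 samples rows and columns uniformly; round 2 reuses those samples
    to build data-dependent probabilities (a hypothetical surrogate for the
    paper's encoding-oriented sampling) and draws a refined set of rows and
    columns, from which a pseudoskeleton (CUR) approximation is assembled.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    c, r = min(c, n), min(r, m)

    # Round 1: uniform sampling of row and column indices.
    rows1 = rng.choice(m, size=r, replace=False)
    cols1 = rng.choice(n, size=c, replace=False)

    # Round 2: importance weights estimated from the round-1 samples only,
    # so the full matrix is never touched (columns weighted by the sampled
    # rows, rows weighted by the sampled columns).
    col_w = np.linalg.norm(A[rows1, :], axis=0) ** 2 + 1e-12
    row_w = np.linalg.norm(A[:, cols1], axis=1) ** 2 + 1e-12
    cols2 = rng.choice(n, size=c, replace=False, p=col_w / col_w.sum())
    rows2 = rng.choice(m, size=r, replace=False, p=row_w / row_w.sum())

    # Pseudoskeleton assembly: A ~= C @ U @ R, with U the pseudoinverse of
    # the row/column intersection block. The factors are returned without
    # ever forming the full m-by-n product.
    C = A[:, cols2]                              # m x c
    R = A[rows2, :]                              # r x n
    U = np.linalg.pinv(A[np.ix_(rows2, cols2)])  # c x r
    return C, U, R

if __name__ == "__main__":
    # Quick check on a synthetic rank-80 test matrix.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((2000, 80)) @ rng.standard_normal((80, 1500))
    C, U, R = cascaded_cur_sketch(A, c=100, r=100)
    err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
    print(f"relative Frobenius error: {err:.3f}")

In this toy version the second-round probabilities are simple squared-norm weights computed from the first-round samples; the paper's encoding-oriented sampling is more sophisticated, but the cascade structure (coarse uniform pass, then a refined data-dependent pass) is the same idea.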


Supplemental Material

zhang_linear_cost_matrix.mp4 (mp4, 447.1 MB)


Published in

KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2017, 2240 pages
ISBN: 9781450348874
DOI: 10.1145/3097983
Copyright © 2017 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

KDD '17 paper acceptance rate: 64 of 748 submissions (9%). Overall acceptance rate: 1,133 of 8,635 submissions (13%).
