ABSTRACT
Matrix sketching aims to find a compact representation of a matrix that preserves most of its properties, and it is a fundamental building block of modern scientific computing. Randomized algorithms represent the state of the art and have attracted great interest from machine learning, data mining, and theoretical computer science. However, they still require access to the entire input matrix to produce the desired factorizations, which can be a major computational and memory bottleneck in truly large problems. In this paper, we uncover an interesting theoretical connection between matrix low-rank decomposition and lossy signal compression, based on which we devise a cascaded compression sampling framework that approximates an m-by-n matrix in only O(m+n) time and space. The proposed method accesses only a small number of matrix rows and columns, which significantly reduces the memory footprint. Meanwhile, by chaining two rounds of approximation and upgrading the sampling strategy from uniform probabilities to a more sophisticated, encoding-oriented scheme, the algorithm achieves a significant boost in accuracy and uncovers finer-grained structure in the data. Empirical results on a wide spectrum of real-world, large-scale matrices show that, using only linear time and space, our method rivals the accuracy of state-of-the-art randomized algorithms that consume a quadratic, O(mn), amount of resources.
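To make the cascaded, two-round idea concrete, below is a minimal NumPy sketch of one plausible instantiation: a first round of uniform row and column sampling, followed by a second, encoding-driven round that picks representatives via a few LBG-style k-means iterations, feeding a CUR-type reconstruction. This is an illustration under stated assumptions, not the paper's algorithm: the names `cascaded_sketch`, `encode_pick`, and `cur_from_indices` are invented for this example, and k-means quantization is used only as a stand-in for the encoding-oriented sampling described above.

```python
# Illustrative two-round, sampling-based low-rank approximation.
# NOT the authors' implementation: function names, the pseudoinverse-core
# CUR reconstruction, and the k-means encoding step are assumptions made
# purely for demonstration.
import numpy as np

def cur_from_indices(A, rows, cols):
    """CUR-style approximation A ~ C @ U @ R from sampled index sets."""
    C = A[:, cols]                 # m x c sampled columns
    R = A[rows, :]                 # r x n sampled rows
    W = A[np.ix_(rows, cols)]      # r x c intersection block
    U = np.linalg.pinv(W)          # c x r core via pseudoinverse
    return C, U, R

def sqdist(X, Y):
    """Pairwise squared Euclidean distances, shape |X| x |Y|."""
    return (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T

def cascaded_sketch(A, c, seed=0):
    rng = np.random.default_rng(seed)
    m, n = A.shape

    # Round 1: uniform sampling gives a cheap pilot sketch.
    rows1 = rng.choice(m, size=c, replace=False)
    cols1 = rng.choice(n, size=c, replace=False)

    # Round 2: "encoding-oriented" re-sampling. Quantize the points with a
    # few LBG-style k-means iterations, then keep the point nearest each
    # codeword -- one plausible reading of an encoding-driven strategy.
    def encode_pick(V, k, iters=10):
        centers = V[rng.choice(len(V), size=k, replace=False)]
        for _ in range(iters):
            assign = sqdist(V, centers).argmin(1)
            for j in range(k):
                mask = assign == j
                if mask.any():                # skip empty clusters
                    centers[j] = V[mask].mean(0)
        # Nearest point to each codeword; duplicates collapse via unique,
        # so the result may hold slightly fewer than k indices.
        return np.unique(sqdist(V, centers).argmin(0))

    # Rows are encoded using only the sampled columns as features (O(mc)
    # entries touched), and symmetrically for the columns (O(nc)).
    rows2 = encode_pick(A[:, cols1], c)
    cols2 = encode_pick(A[rows1, :].T, c)
    return cur_from_indices(A, rows2, cols2)

# Usage: approximate a rank-50 test matrix with a size-100 sketch.
A = np.random.randn(2000, 50) @ np.random.randn(50, 1500)
C, U, R = cascaded_sketch(A, c=100)
err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
print(f"relative Frobenius error: {err:.3f}")
```

For a fixed sketch size c, each round touches only O((m+n)c) matrix entries, which is the linear-cost regime the abstract describes; the dense test matrix in the usage example is materialized in full only for the sake of a runnable demonstration.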