Abstract
Column-based low-rank matrix approximation is a useful method for analyzing and interpreting data in machine learning and data mining. However, existing methods face accuracy and scalability problems when dealing with large-scale data. In this paper we propose a new parallel framework for column-based low-rank matrix approximation based on a divide-and-conquer strategy. It consists of three stages: (1) divide the original matrix into several small submatrices; (2) perform column-based low-rank matrix approximation on each submatrix in parallel to select columns; (3) combine these columns into the final result. We prove that the new parallel framework has a (1+\(\epsilon \)) relative-error upper bound. We also show that it is more scalable than existing work. Comparison experiments and an application to kernel methods demonstrate the effectiveness and efficiency of our method on both synthetic and real-world datasets.
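The three-stage framework described above can be sketched in NumPy as follows. This is a minimal illustration, not the paper's exact algorithm: the squared-norm (length-squared) column sampler stands in for whatever column-selection routine the framework plugs in per submatrix (e.g. leverage-score or adaptive sampling), and all function names are illustrative assumptions.

```python
import numpy as np

def select_columns(block, k, rng):
    """Stage-2 stand-in: sample k columns of a submatrix with probability
    proportional to their squared Euclidean norms."""
    probs = np.sum(block ** 2, axis=0)
    probs = probs / probs.sum()
    return rng.choice(block.shape[1], size=k, replace=False, p=probs)

def divide_and_conquer_css(A, num_blocks, k_per_block, seed=0):
    """Divide-and-conquer column subset selection sketch:
    (1) split A's columns into num_blocks submatrices,
    (2) select columns within each block (each iteration is independent,
        so this loop is embarrassingly parallel),
    (3) combine the selected columns into C and project A onto span(C)."""
    rng = np.random.default_rng(seed)
    blocks = np.array_split(np.arange(A.shape[1]), num_blocks)
    chosen = []
    for cols in blocks:  # parallelizable across submatrices
        local = select_columns(A[:, cols], k_per_block, rng)
        chosen.extend(cols[local])
    chosen = np.array(sorted(chosen))
    C = A[:, chosen]
    A_approx = C @ np.linalg.pinv(C) @ A  # best approximation within span(C)
    return chosen, A_approx
```

For a genuinely parallel run, the per-block loop could be dispatched with `concurrent.futures.ProcessPoolExecutor`; the sequential loop above is kept for clarity.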
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Wu, J., Liao, S. (2015). Accuracy-Preserving and Scalable Column-Based Low-Rank Matrix Approximation. In: Zhang, S., Wirsing, M., Zhang, Z. (eds) Knowledge Science, Engineering and Management. KSEM 2015. Lecture Notes in Computer Science(), vol 9403. Springer, Cham. https://doi.org/10.1007/978-3-319-25159-2_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25158-5
Online ISBN: 978-3-319-25159-2