ABSTRACT
Latent factor models for matrix completion are known to be an effective method for building recommender systems. Currently, stochastic gradient descent (SGD) is considered one of the best latent factor-based algorithms for matrix completion. In this paper we discuss GASGD, a distributed asynchronous variant of SGD for large-scale matrix completion that (i) leverages data partitioning schemes based on graph partitioning techniques, (ii) exploits specific characteristics of the input data and (iii) introduces an explicit parameter to tune the synchronization frequency among the computing nodes. We show empirically how, thanks to these features, GASGD achieves a fast convergence rate while incurring a smaller communication cost than current asynchronous distributed SGD implementations.
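To make the abstract's ingredients concrete, below is a minimal single-process Python sketch of distributed SGD for matrix completion with an explicit synchronization-frequency knob, simulated with worker-local factor replicas that are periodically averaged. This is an illustration of the general bulk-synchronous pattern only, not the authors' implementation: the names (`sgd_matrix_completion`, `syncs_per_epoch`, etc.) are hypothetical, and a naive round-robin split of the ratings stands in for the paper's graph-partitioning scheme.

```python
import numpy as np

def sgd_matrix_completion(ratings, n_users, n_items, rank=10,
                          lr=0.01, reg=0.05, epochs=20,
                          n_workers=4, syncs_per_epoch=2, seed=0):
    """ratings: list of (user, item, value) triples.

    Returns factor matrices U (n_users x rank) and V (n_items x rank)
    such that U @ V.T approximates the observed ratings.
    """
    rng = np.random.default_rng(seed)
    # Each simulated worker keeps its own replica of the factors;
    # replicas are averaged `syncs_per_epoch` times per epoch, so this
    # parameter trades communication cost against replica divergence.
    U = [rng.normal(scale=0.1, size=(n_users, rank)) for _ in range(n_workers)]
    V = [rng.normal(scale=0.1, size=(n_items, rank)) for _ in range(n_workers)]
    # Naive round-robin data split (a real system would use graph
    # partitioning here to reduce cross-worker factor sharing).
    shards = [ratings[w::n_workers] for w in range(n_workers)]
    for _ in range(epochs):
        for sync in range(syncs_per_epoch):
            for w in range(n_workers):
                # Each worker processes its slice of the shard for this
                # sub-epoch, updating only its local replica.
                for u, i, r in shards[w][sync::syncs_per_epoch]:
                    pu, qi = U[w][u].copy(), V[w][i].copy()
                    err = r - pu @ qi
                    U[w][u] += lr * (err * qi - reg * pu)
                    V[w][i] += lr * (err * pu - reg * qi)
            # Synchronization barrier: average the replicas.
            U_avg, V_avg = sum(U) / n_workers, sum(V) / n_workers
            U = [U_avg.copy() for _ in range(n_workers)]
            V = [V_avg.copy() for _ in range(n_workers)]
    return sum(U) / n_workers, sum(V) / n_workers

# Toy usage: 3 users, 3 items, a handful of observed ratings.
R = [(0, 0, 5.0), (0, 1, 3.0), (1, 1, 4.0), (2, 0, 1.0), (2, 2, 2.0)]
U, V = sgd_matrix_completion(R, n_users=3, n_items=3, rank=2)
print("predicted rating (user 0, item 2):", U[0] @ V[2])
```

Raising `syncs_per_epoch` makes the replicas agree more closely (closer to serial SGD) at the price of more communication; lowering it saves bandwidth but lets the replicas drift between averaging steps, which is exactly the trade-off the abstract's tunable synchronization frequency exposes.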