ABSTRACT
With the rapid growth of online user data, it is challenging to develop preference learning algorithms that are both flexible in modeling and affordable in computation. In this paper we develop nonparametric matrix factorization methods by allowing the latent factors of two low-rank matrix factorization models, the singular value decomposition (SVD) and probabilistic principal component analysis (pPCA), to be data-driven, with dimensionality that grows with the data size. We show that the formulations of the two nonparametric models are very similar, and that their optimizations follow similar procedures. Compared to traditional parametric low-rank methods, nonparametric models are appealing for their flexibility in modeling complex data dependencies. However, this modeling advantage comes at a computational price: it is highly challenging to scale them to large problems, which hampers their use in applications such as collaborative filtering. In this paper we introduce novel optimization algorithms, simple to implement, that make learning both nonparametric matrix factorization models highly efficient on large-scale problems. Our experiments on EachMovie and Netflix, the two largest public benchmarks to date, demonstrate that the nonparametric models predict user ratings more accurately, and are comparable or sometimes even faster to train, than previous state-of-the-art parametric matrix factorization models.
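To fix ideas, the parametric low-rank baseline that the paper's nonparametric models generalize can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it fits fixed-rank user and item factors to the observed entries of a rating matrix by gradient descent on squared error, with synthetic data standing in for real ratings.

```python
import numpy as np

# Minimal sketch of fixed-rank matrix factorization for rating prediction
# (the parametric baseline, NOT the paper's nonparametric method):
# fit U (users x k) and V (items x k) to observed entries only.
rng = np.random.default_rng(0)
n_users, n_items, k = 20, 15, 3

# Synthetic ground-truth low-rank ratings; ~50% of entries are observed.
R_true = rng.normal(size=(n_users, k)) @ rng.normal(size=(k, n_items))
mask = rng.random((n_users, n_items)) < 0.5

U = 0.1 * rng.normal(size=(n_users, k))
V = 0.1 * rng.normal(size=(n_items, k))
lr, lam = 0.05, 0.01  # learning rate and L2 regularization strength

for _ in range(500):
    E = mask * (U @ V.T - R_true)   # residual on observed entries only
    U -= lr * (E @ V + lam * U)     # gradient step on user factors
    V -= lr * (E.T @ U + lam * V)   # gradient step on item factors

rmse = np.sqrt((mask * (U @ V.T - R_true) ** 2).sum() / mask.sum())
print(f"train RMSE on observed entries: {rmse:.3f}")
```

In this parametric setting the rank k is fixed a priori; the paper's contribution is to let the effective dimensionality of the factors be data-driven and to make that nonparametric learning efficient at Netflix scale.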