Exploit latent Dirichlet allocation for collaborative filtering

Li, Zhoujun; Zhang, Haijun; Wang, Senzhang; Huang, Feiran; Li, Zhenping; Zhou, Jianshe

doi:10.1007/s11704-016-6078-1

Research Article
Published: 11 May 2018

Volume 12, pages 571–581, (2018)

Frontiers of Computer Science

Zhoujun Li^1,
Haijun Zhang^1,2,
Senzhang Wang^3,4,
Feiran Huang^1,
Zhenping Li^2 &
Jianshe Zhou^5

Abstract

Previous work on the one-class collaborative filtering (OCCF) problem can be roughly categorized into pointwise methods, pairwise methods, and content-based methods. A fundamental assumption of these approaches is that all missing values in the user-item rating matrix are negative. However, this assumption may not hold, because the missing values may contain both negative and positive examples. For example, a user who has not given positive feedback on an item does not necessarily dislike it; the user may simply be unfamiliar with it. Meanwhile, content-based methods, e.g., collaborative topic regression (CTR), usually require textual content information about the items, so their applicability is largely limited when text information is not available. In this paper, we propose to apply the latent Dirichlet allocation (LDA) model to OCCF to address the above-mentioned problems. The basic idea of this approach is that items are regarded as words, users are regarded as documents, and the user-item feedback matrix constitutes the corpus. Our model drops the strong assumption that missing values are all negative and uses only the observed data to predict a user's interests. Additionally, the proposed model does not need content information about the items. Experimental results indicate that the proposed method outperforms previous methods on various ranking-oriented evaluation metrics. We further combine this method with a matrix factorization-based method to tackle the multi-class collaborative filtering (MCCF) problem, which also achieves better performance in predicting user ratings.
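
The users-as-documents, items-as-words idea can be illustrated with a small sketch. The abstract does not state the inference procedure or hyperparameters used in the paper, so the code below swaps in a standard collapsed Gibbs sampler for LDA; the function names (fit_lda_gibbs, score_items, recommend), the values of alpha, beta, and the topic count, and the toy feedback data are all hypothetical and chosen only for illustration. Only observed (positive) feedback enters the model, and unseen items are ranked by the sum over topics of theta[u][k] * phi[k][i].

```python
import random


def fit_lda_gibbs(user_items, n_items, n_topics=2, alpha=0.5, beta=0.1,
                  n_iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA: each user is a 'document' and each
    observed item is a 'word'. Hyperparameter values are assumptions."""
    rng = random.Random(seed)
    n_users = len(user_items)
    user_topic = [[0] * n_topics for _ in range(n_users)]  # counts per (user, topic)
    topic_item = [[0] * n_items for _ in range(n_topics)]  # counts per (topic, item)
    topic_total = [0] * n_topics                           # counts per topic
    assignments = []

    # Assign a random initial topic to every observed (user, item) pair.
    for u, items in enumerate(user_items):
        z_u = []
        for i in items:
            k = rng.randrange(n_topics)
            z_u.append(k)
            user_topic[u][k] += 1
            topic_item[k][i] += 1
            topic_total[k] += 1
        assignments.append(z_u)

    for _ in range(n_iters):
        for u, items in enumerate(user_items):
            for pos, i in enumerate(items):
                k = assignments[u][pos]
                # Remove the current assignment from the counts.
                user_topic[u][k] -= 1
                topic_item[k][i] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (user_topic[u][t] + alpha)
                    * (topic_item[t][i] + beta)
                    / (topic_total[t] + n_items * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[u][pos] = k
                user_topic[u][k] += 1
                topic_item[k][i] += 1
                topic_total[k] += 1

    # Point estimates of user-topic (theta) and topic-item (phi) distributions.
    theta = [
        [(user_topic[u][t] + alpha) / (len(user_items[u]) + n_topics * alpha)
         for t in range(n_topics)]
        for u in range(n_users)
    ]
    phi = [
        [(topic_item[t][i] + beta) / (topic_total[t] + n_items * beta)
         for i in range(n_items)]
        for t in range(n_topics)
    ]
    return theta, phi


def score_items(theta_u, phi):
    """Score every item for one user: sum over topics of theta_u[k] * phi[k][i]."""
    return [sum(theta_u[t] * phi[t][i] for t in range(len(phi)))
            for i in range(len(phi[0]))]


def recommend(u, user_items, theta, phi, top_n=3):
    """Rank the items user u has not interacted with by predicted score."""
    seen = set(user_items[u])
    scores = score_items(theta[u], phi)
    unseen = [i for i in range(len(scores)) if i not in seen]
    return sorted(unseen, key=lambda i: scores[i], reverse=True)[:top_n]


# Toy positive-only feedback: one list of item ids per user (hypothetical data).
feedback = [
    [0, 1, 2], [0, 1, 3], [1, 2, 3],
    [4, 5, 6], [4, 5, 7], [5, 6, 7],
]
theta, phi = fit_lda_gibbs(feedback, n_items=8)
top = recommend(0, feedback, theta, phi)
print(top)
```

Because missing entries never appear in the corpus, the sketch makes no assumption that unobserved items are negative; they are simply scored after fitting. For real data, an established LDA implementation would normally replace this hand-written sampler.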

References

[1] Pan W K, Chen L. GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering. In: Proceedings of the 23rd International Joint Conference on Artificial Intelligence. 2013, 2691–2697
[2] Pan R, Zhou Y H, Cao B, Liu N N, Lukose R, Scholz M, Yang Q. One-class collaborative filtering. In: Proceedings of the 8th IEEE International Conference on Data Mining. 2008, 502–511
[3] Rendle S, Freudenthaler C, Gantner Z, Schmidt-Thieme L. BPR: Bayesian personalized ranking from implicit feedback. In: Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence. 2009, 452–461
[4] Li Y, Hu J, Zhai C X, Chen Y. Improving one-class collaborative filtering by incorporating rich user information. In: Proceedings of the 19th ACM Conference on Information and Knowledge Management. 2010, 959–968
[5] Hu Y F, Koren Y, Volinsky C. Collaborative filtering for implicit feedback datasets. In: Proceedings of the 8th IEEE International Conference on Data Mining. 2008, 263–272
[6] Zhang H J, Li Z J, Chen Y, Zhang X M, Wang S Z. Exploit latent Dirichlet allocation for one-class collaborative filtering. In: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management. 2014, 1991–1994
[7] Li B, Yang Q, Xue X Y. Can movies and books collaborate? Cross-domain collaborative filtering for sparsity reduction. In: Proceedings of the International Joint Conference on Artificial Intelligence. 2009, 2052–2057
[8] Hofmann T. Latent semantic models for collaborative filtering. ACM Transactions on Information Systems, 2004, 22(1): 89–115
[9] Salakhutdinov R, Mnih A, Hinton G. Restricted Boltzmann machines for collaborative filtering. In: Proceedings of the 24th International Conference on Machine Learning. 2007, 791–798
[10] Zhang H J, Liu C Y, Li Z J, Zhang X M. Collaborative filtering based on rating psychology. In: Proceedings of International Conference on Web-Age Information Management. 2013, 655–665
[11] Gu B, Sheng V S, Tay K Y, Romano W, Li S. Incremental support vector learning for ordinal regression. IEEE Transactions on Neural Networks and Learning Systems, 2014, 26(7): 1403–1416
[12] Gu B, Sun X M, Sheng V S. Structural minimax probability machine. IEEE Transactions on Neural Networks and Learning Systems, 2016, 28(7): 1646–1656
[13] Ma T H, Zhou J J, Tang M L, Tian Y, Al-Dhelaan A, Al-Rodhaan M, Lee S. Social network and tag sources based augmenting collaborative recommender system. IEICE Transactions on Information and Systems, 2015, E98-D(4): 902–910
[14] Koren Y, Bell R, Volinsky C. Matrix factorization techniques for recommender systems. Computer, 2009, 42(8): 30–37
[15] Ma H, Yang H X, Lyu M R, King I. SoRec: social recommendation using probabilistic matrix factorization. In: Proceedings of the 17th ACM Conference on Information and Knowledge Management. 2008, 931–940
[16] Funk S. Netflix update: try this at home. Blog Post Sifter, 2006
[17] Srebro N, Jaakkola T. Weighted low-rank approximations. In: Proceedings of the 20th International Conference on Machine Learning. 2003, 720–727
[18] Cremonesi P, Koren Y, Turrin R. Performance of recommender algorithms on top-n recommendation tasks. In: Proceedings of the 4th ACM Conference on Recommender Systems. 2010, 39–46
[19] Mnih A, Salakhutdinov R R. Probabilistic matrix factorization. In: Proceedings of International Conference on Machine Learning. 2012, 880–887
[20] He H B, Garcia E A. Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 2009, 21(9): 1263–1284
[21] Wang C, Blei D M. Collaborative topic modeling for recommending scientific articles. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2011, 448–456
[22] Purushotham S, Liu Y, Kuo C C J. Collaborative topic regression with social matrix factorization for recommendation systems. Computer Science, 2012
[23] Blei D M, Ng A Y, Jordan M I. Latent Dirichlet allocation. Journal of Machine Learning Research, 2003, 3: 993–1022
[24] Chen Y, Yin X S, Li Z J, Hu X H, Huang J X. A LDA-based approach to promoting ranking diversity for genomics information retrieval. BMC Genomics, 2012, 13(Suppl 3): 104–111
[25] Wilson J, Chaudhury S, Lall B. Improving collaborative filtering based recommenders using topic modelling. In: Proceedings of IEEE/WIC/ACM International Joint Conference on Artificial Intelligence. 2014, 340–346
[26] Hofmann T. Probabilistic latent semantic indexing. In: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 1999, 56–73
[27] Heinrich G. Parameter estimation for text analysis. Technical Report. 2005
[28] Gantner Z, Rendle S, Freudenthaler C, Schmidt-Thieme L. MyMediaLite: a free recommender system library. In: Proceedings of the 5th ACM Conference on Recommender Systems. 2011, 305–308
[29] Newman D, Asuncion A U, Smyth P, Welling M. Distributed inference for latent Dirichlet allocation. In: Proceedings of Conference on Neural Information Processing Systems. 2007, 1–6
[30] Wang Y, Bai H, Stanton M, Chen W Y, Chang E Y. PLDA: parallel latent Dirichlet allocation for large-scale applications. In: Proceedings of International Conference on Algorithmic Aspects in Information and Management. 2009, 301–314
[31] Magnusson M, Jonsson L, Villani M, Broman D. Parallelizing LDA using partially collapsed Gibbs sampling. Statistics, 2015

Acknowledgments

We are grateful to Weike Pan for sharing his code for the GBPR algorithm [1], which allowed us to evaluate the algorithm more efficiently and more fairly. This work was supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 61370126, 61672081, 71540028, 61571052, 61602237), the National High-tech R&D Program of China (2015AA016004), the Beijing Advanced Innovation Center for Imaging Technology (BAICIT-2016001), the Fund of the State Key Laboratory of Software Development Environment (SKLSDE-2013ZX-19), the Fund of Beijing Social Science (14JGC103), the Statistics Research Project of National Bureau (2013LY055), and the Fund of Beijing Wuzi University, China (GJB20141002).

Author information

Authors and Affiliations

State Key Laboratory of Software Development Environment, Beihang University, Beijing, 100191, China
Zhoujun Li, Haijun Zhang & Feiran Huang
School of Information, Beijing Wuzi University, Beijing, 101149, China
Haijun Zhang & Zhenping Li
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
Senzhang Wang
Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing, 211106, China
Senzhang Wang
Beijing Advanced Innovation Center for Imaging Technology, Capital Normal University, Beijing, 100048, China
Jianshe Zhou

Corresponding author

Correspondence to Haijun Zhang.

Additional information

Zhoujun Li received the BS degree from the School of Computer Science, Wuhan University, China in 1984, and the MS and PhD degrees from the School of Computer Science, National University of Defense Technology, China. He is currently a professor at Beihang University, China. His research interests include data mining, information retrieval, and information security. He is a member of the IEEE.

Haijun Zhang received the BS and MS degrees in management science from the China University of Mining and Technology, China in 1997 and 2004, respectively. He received his PhD degree from the School of Computer Science and Engineering, Beihang University, China in 2016. He has been working at the School of Information, Beijing Wuzi University, China since 2004. His major interests are data mining and recommender systems.

Senzhang Wang received his PhD degree from Beihang University, China in 2015. He is currently an assistant professor at the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China. His research focuses on data mining, social computing, big data, and urban computing. He has published more than 30 papers in premier conferences and journals in computer science, including SIGKDD, AAAI, SDM, ACM SIGSPATIAL, ECML-PKDD, ACM Transactions on Intelligent Systems and Technology, Knowledge and Information Systems, IEEE Transactions on Intelligent Transportation Systems, and IEEE Transactions on Multimedia.

Feiran Huang received his bachelor's degree from Central South University, China in 2011. He is currently a PhD candidate at the School of Computer Science and Engineering, Beihang University, China. His research interests include data mining and social media analysis.

Zhenping Li received her bachelor's and master's degrees in operations research from Shandong University, China in 1989 and 1994, respectively. In 2004, she received her doctoral degree in operations research from the Chinese Academy of Sciences, China. Her research interests include complex networks and intelligent algorithms. She is now a professor in the School of Information at Beijing Wuzi University, China.

Jianshe Zhou received the PhD degree from Wuhan University, China in 2002. He is now a professor at Capital Normal University, China. His research interests include linguistics, logic, and language intelligence.

Electronic supplementary material

Supplementary material, approximately 300 KB.

About this article

Cite this article

Li, Z., Zhang, H., Wang, S. et al. Exploit latent Dirichlet allocation for collaborative filtering. Front. Comput. Sci. 12, 571–581 (2018). https://doi.org/10.1007/s11704-016-6078-1

Received: 03 February 2016
Accepted: 22 August 2016
Published: 11 May 2018
Issue Date: June 2018
DOI: https://doi.org/10.1007/s11704-016-6078-1
