Abstract
Matrix factorization (MF) methods achieve strong recommendation performance and can flexibly incorporate side information, but the latent factors they derive are hard for humans to interpret. Recently, item-item co-occurrence information has been exploited to learn item embeddings and enhance recommendation performance. However, item-item co-occurrence information, constructed from the sparse and long-tail distributed user-item interaction matrix, is over-estimated for rare items, which could lead to bias in the learned item embeddings. In this paper, we seek to evaluate and improve the interpretability of item embeddings by leveraging a dense item-tag relevance matrix. Specifically, we design two metrics to quantitatively evaluate the interpretability of item embeddings from different viewpoints: the interpretability of individual dimensions of item embeddings and the semantic coherence of local neighborhoods in the latent space. We also propose a tag-informed item embedding (TIE) model that jointly factorizes the user-item interaction matrix, the item-item co-occurrence matrix, and the item-tag relevance matrix with shared item embeddings, so that the different forms of information can cooperate to learn better item embeddings. Experiments on the MovieLens20M dataset demonstrate that, compared with other state-of-the-art MF methods, TIE achieves better top-N recommendations, and the relative improvement is larger when the user-item interaction matrix becomes sparser. By leveraging the item-tag relevance information, individual dimensions of item embeddings become more interpretable and local neighborhoods in the latent space become more semantically coherent; the bias in the learned item embeddings is also mitigated to some extent.
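To make the idea of joint factorization with shared item embeddings concrete, the following is a minimal sketch, not the TIE model itself. It assumes plain squared-error losses, L2 regularization, and full-batch gradient descent; the abstract does not specify the actual objective, weighting scheme, or optimizer. All names (`joint_loss`, `fit`, the context factors `C`, the tag factors `W`, and the weights `lam_c`, `lam_t`, `reg`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def joint_loss(R, M, T, U, V, C, W, lam_c=1.0, lam_t=1.0, reg=0.01):
    """Squared-error loss of three factorizations sharing item embeddings V.

    R: user-item matrix, M: item-item co-occurrence matrix,
    T: item-tag relevance matrix. The weights are illustrative assumptions.
    """
    loss = np.sum((R - U @ V.T) ** 2)            # user-item term
    loss += lam_c * np.sum((M - V @ C.T) ** 2)   # item-item term
    loss += lam_t * np.sum((T - V @ W.T) ** 2)   # item-tag term
    loss += reg * sum(np.sum(X ** 2) for X in (U, V, C, W))
    return loss


def fit(R, M, T, k=4, steps=300, lr=0.01,
        lam_c=1.0, lam_t=1.0, reg=0.01, seed=0):
    """Full-batch gradient descent on joint_loss (an assumed optimizer)."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    n_tags = T.shape[1]
    U = 0.1 * rng.standard_normal((n_users, k))  # user factors
    V = 0.1 * rng.standard_normal((n_items, k))  # shared item embeddings
    C = 0.1 * rng.standard_normal((n_items, k))  # item context factors
    W = 0.1 * rng.standard_normal((n_tags, k))   # tag factors
    history = [joint_loss(R, M, T, U, V, C, W, lam_c, lam_t, reg)]
    for _ in range(steps):
        # Residuals of the three reconstructions.
        E_r = U @ V.T - R
        E_c = V @ C.T - M
        E_t = V @ W.T - T
        # V receives gradient from all three terms; this is the coupling.
        grad_U = 2 * (E_r @ V + reg * U)
        grad_V = 2 * (E_r.T @ U + lam_c * (E_c @ C)
                      + lam_t * (E_t @ W) + reg * V)
        grad_C = 2 * (lam_c * (E_c.T @ V) + reg * C)
        grad_W = 2 * (lam_t * (E_t.T @ V) + reg * W)
        U -= lr * grad_U
        V -= lr * grad_V
        C -= lr * grad_C
        W -= lr * grad_W
        history.append(joint_loss(R, M, T, U, V, C, W, lam_c, lam_t, reg))
    return U, V, C, W, history
```

Because `V` appears in all three reconstruction terms, its gradient combines signals from user interactions, co-occurrence, and tag relevance, which is the sense in which the different sources of information shape the same item embeddings.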
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61672322, 61672324), the Natural Science Foundation of Shandong Province (2016ZRE27468) and the Fundamental Research Funds of Shandong University.
Author information
Tao Lian is a lecturer with College of Data Science, Taiyuan University of Technology, China. Before that, he received his PhD and BE in computer science and technology from Shandong University, China in 2018 and 2011, respectively. His research interests include recommender systems, information retrieval, and data mining.
Lin Du is a senior student in Software College at Shandong University, China. He is an undergraduate research assistant in the Information Retrieval Lab at Shandong University, China. His research interests include recommender systems, data mining, and machine learning.
Mingfu Zhao is currently pursuing his ME at the Institute of Computing Technology, Chinese Academy of Sciences, China. Before that, he received his BE in software engineering from Shandong University, China in 2018. His research now focuses on big data and IoT.
Chaoran Cui received his PhD degree in computer science from Shandong University, China in 2015. Prior to that, he received his BE degree in software engineering from Shandong University, China in 2010. During 2015–2016, he was a research fellow at Singapore Management University, Singapore. He is now a professor with School of Computer Science and Technology, Shandong University of Finance and Economics, China. His research interests include information retrieval, recommender systems, multimedia, and machine learning.
Jun Ma received his BS, MS, and PhD degrees, all in computer science, from Shandong University, China in 1982, Ibaraki University, Japan in 1988, and Kyushu University, Japan in 1997, respectively. Currently he is a full professor with School of Computer Science and Technology at Shandong University, China. He was a visiting professor at the Department of Computer Science, Ibaraki University, Japan in 1994 and a senior researcher at Fraunhofer Institute, Germany from 1999 to 2003. His research interests include information retrieval, data mining, and natural language processing.
Zhumin Chen is an associate professor with School of Computer Science and Technology at Shandong University, China. He received his PhD degree in computer science from Shandong University, China in 2008. His research interests mainly include information retrieval, data mining, and social media processing.
Cite this article
Lian, T., Du, L., Zhao, M. et al. Evaluating and improving the interpretability of item embeddings using item-tag relevance information. Front. Comput. Sci. 14, 143603 (2020). https://doi.org/10.1007/s11704-019-7427-7
DOI: https://doi.org/10.1007/s11704-019-7427-7