Evaluating and improving the interpretability of item embeddings using item-tag relevance information

  • Research Article
Frontiers of Computer Science

Abstract

Matrix factorization (MF) methods have superior recommendation performance and can flexibly incorporate side information, but the latent factors they derive are hard for humans to interpret. Recently, item-item co-occurrence information has been exploited to learn item embeddings and enhance recommendation performance. However, item-item co-occurrence information, constructed from the sparse and long-tail distributed user-item interaction matrix, is over-estimated for rare items, which can bias the learned item embeddings. In this paper, we seek to evaluate and improve the interpretability of item embeddings by leveraging a dense item-tag relevance matrix. Specifically, we design two metrics to quantitatively evaluate the interpretability of item embeddings from different viewpoints: the interpretability of individual dimensions of item embeddings and the semantic coherence of local neighborhoods in the latent space. We also propose a tag-informed item embedding (TIE) model that jointly factorizes the user-item interaction matrix, the item-item co-occurrence matrix and the item-tag relevance matrix with shared item embeddings, so that the different forms of information can cooperate to learn better item embeddings. Experiments on the MovieLens 20M dataset demonstrate that, compared with other state-of-the-art MF methods, TIE achieves better top-N recommendations, and the relative improvement is larger when the user-item interaction matrix becomes sparser. By leveraging the item-tag relevance information, individual dimensions of item embeddings become more interpretable and local neighborhoods in the latent space become more semantically coherent; the bias in the learned item embeddings is also mitigated to some extent.
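The joint factorization with shared item embeddings described in the abstract can be illustrated with a small sketch. The abstract does not give the objective function, so everything concrete here is an assumption made for illustration: the unweighted squared loss, the separate context embeddings `G` for the co-occurrence matrix, the weights `lam_c` and `lam_t`, the full-batch gradient descent, and the random toy data. It is not the authors' TIE implementation; it only shows how a single item matrix `V` can receive gradient signal from all three matrices at once.

```python
import numpy as np


def joint_factorize(R, C, T, dim=4, lam_c=1.0, lam_t=1.0, reg=0.01,
                    lr=0.01, epochs=500, seed=0):
    """Jointly factorize three matrices that share one item-embedding matrix V.

    R: users x items interactions, C: items x items co-occurrence,
    T: items x tags relevance. Loss and optimizer are assumptions of this sketch.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    n_tags = T.shape[1]
    U = 0.1 * rng.standard_normal((n_users, dim))  # user embeddings
    V = 0.1 * rng.standard_normal((n_items, dim))  # shared item embeddings
    G = 0.1 * rng.standard_normal((n_items, dim))  # item context embeddings (assumed)
    H = 0.1 * rng.standard_normal((n_tags, dim))   # tag embeddings
    losses = []
    for _ in range(epochs):
        # Reconstruction errors of the three factorizations.
        E_r = U @ V.T - R
        E_c = V @ G.T - C
        E_t = V @ H.T - T
        penalty = sum(np.sum(M ** 2) for M in (U, V, G, H))
        losses.append(0.5 * (np.sum(E_r ** 2) + lam_c * np.sum(E_c ** 2)
                             + lam_t * np.sum(E_t ** 2) + reg * penalty))
        # Gradients of the loss above; V collects a term from every matrix.
        grad_U = E_r @ V + reg * U
        grad_V = E_r.T @ U + lam_c * (E_c @ G) + lam_t * (E_t @ H) + reg * V
        grad_G = lam_c * (E_c.T @ V) + reg * G
        grad_H = lam_t * (E_t.T @ V) + reg * H
        U -= lr * grad_U
        V -= lr * grad_V
        G -= lr * grad_G
        H -= lr * grad_H
    return U, V, G, H, losses


# Toy data (random, for illustration only).
rng = np.random.default_rng(42)
R = (rng.random((20, 12)) < 0.3).astype(float)  # sparse binary interactions
C = (R.T @ R) / R.shape[0]                      # scaled co-occurrence counts
np.fill_diagonal(C, 0.0)
T = rng.random((12, 5))                         # dense item-tag relevance in [0, 1]

U, V, G, H, losses = joint_factorize(R, C, T)
print(f"loss: {losses[0]:.2f} -> {losses[-1]:.2f}")
```

Setting `lam_t=0.0` in this sketch removes the tag term from the gradient of `V`, which gives a simple way to compare item embeddings learned with and without the item-tag relevance matrix.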

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61672322, 61672324), the Natural Science Foundation of Shandong Province (2016ZRE27468) and the Fundamental Research Funds of Shandong University.

Author information

Corresponding author

Correspondence to Zhumin Chen.

Additional information

Tao Lian is a lecturer with College of Data Science, Taiyuan University of Technology, China. Before that, he received his PhD and BE in computer science and technology from Shandong University, China in 2018 and 2011, respectively. His research interests include recommender systems, information retrieval, and data mining.

Lin Du is a senior student in Software College at Shandong University, China. He is an undergraduate research assistant in the Information Retrieval Lab at Shandong University, China. His research interests include recommender systems, data mining, and machine learning.

Mingfu Zhao is currently pursuing his ME at the Institute of Computing Technology, Chinese Academy of Sciences, China. Before that, he received his BE in software engineering from Shandong University, China in 2018. His research now focuses on big data and IoT.

Chaoran Cui received his PhD degree in computer science from Shandong University, China in 2015. Prior to that, he received his BE degree in software engineering from Shandong University, China in 2010. During 2015–2016, he was a research fellow at Singapore Management University, Singapore. He is now a professor with School of Computer Science and Technology, Shandong University of Finance and Economics, China. His research interests include information retrieval, recommender systems, multimedia, and machine learning.

Jun Ma received his BS, MS, and PhD degrees, all in computer science, from Shandong University, China in 1982, Ibaraki University, Japan in 1988, and Kyushu University, Japan in 1997, respectively. Currently he is a full professor with School of Computer Science and Technology at Shandong University, China. He was a visiting professor at the Department of Computer Science, Ibaraki University, Japan in 1994 and a senior researcher at Fraunhofer Institute, Germany from 1999 to 2003. His research interests include information retrieval, data mining, and natural language processing.

Zhumin Chen is an associate professor with School of Computer Science and Technology at Shandong University, China. He received his PhD degree in computer science from Shandong University, China in 2008. His research interests mainly include information retrieval, data mining, and social media processing.

About this article

Cite this article

Lian, T., Du, L., Zhao, M. et al. Evaluating and improving the interpretability of item embeddings using item-tag relevance information. Front. Comput. Sci. 14, 143603 (2020). https://doi.org/10.1007/s11704-019-7427-7

  • DOI: https://doi.org/10.1007/s11704-019-7427-7
