DOI: 10.1145/3565472.3592949

Research Article

Evaluating Pre-training Strategies for Collaborative Filtering

Published: 19 June 2023

Abstract

Pre-training is essential for effective representation learning, especially in natural language processing and computer vision tasks. The core idea is to learn representations, usually through unsupervised or self-supervised approaches on large and generic source datasets, and to use those pre-trained representations (i.e., embeddings) as initial parameter values when training on the target dataset. Seminal works in this area show that pre-training can act as a regularization mechanism, placing the model parameters in regions of the optimization landscape closer to good local minima than random parameter initialization does. However, no systematic study has evaluated the effectiveness of pre-training strategies for model-based collaborative filtering. This paper conducts a broad set of experiments to evaluate different pre-training strategies for collaborative filtering, using Matrix Factorization (MF) as the base model. We show that such models, when equipped with pre-training in a transfer learning setting, can vastly improve prediction quality over the standard random-initialization baseline, reaching state-of-the-art results on standard recommender systems benchmarks. We also present alternatives for handling the out-of-vocabulary item problem (i.e., items present in the target but not in the source dataset) and show that pre-training in the context of MF acts as a regularizer, which explains the improvement in model generalization.
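The warm-start idea described in the abstract can be sketched as a toy SGD-trained matrix factorization whose item factors are initialized from pre-trained embeddings, with a random fallback for out-of-vocabulary items. This is an illustrative sketch only: the function names (`train_mf`, `init_item_factors`), hyperparameters, and update rule are assumptions, not the paper's actual implementation.

```python
import numpy as np

def init_item_factors(n_items, dim, pretrained, rng):
    # Random initialization by default; rows with a pre-trained embedding
    # are overwritten. Out-of-vocabulary items (absent from `pretrained`)
    # keep the random fallback.
    V = rng.normal(0.0, 0.1, size=(n_items, dim))
    for item, vec in pretrained.items():
        V[item] = vec
    return V

def train_mf(ratings, n_users, n_items, dim=4, epochs=500, lr=0.05, reg=0.01,
             pretrained_items=None, seed=0):
    # Plain SGD matrix factorization on (user, item, rating) triples,
    # optionally warm-started from pre-trained item embeddings.
    rng = np.random.default_rng(seed)
    U = rng.normal(0.0, 0.1, size=(n_users, dim))
    V = init_item_factors(n_items, dim, pretrained_items or {}, rng)
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - U[u] @ V[i]
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * U[u] - reg * V[i])
    return U, V
```

In a transfer learning setting, `pretrained_items` would hold embeddings learned on a large source dataset (e.g., via an item-embedding method), while `ratings` comes from the smaller target dataset; items seen only in the target keep random initialization.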


Published In

UMAP '23: Proceedings of the 31st ACM Conference on User Modeling, Adaptation and Personalization
June 2023
333 pages
ISBN: 9781450399326
DOI: 10.1145/3565472

Publisher

Association for Computing Machinery

New York, NY, United States
    Author Tags

    1. collaborative filtering
    2. model initialization
    3. transfer learning

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

Conference

UMAP '23

Acceptance Rates

Overall Acceptance Rate: 162 of 633 submissions, 26%
