DOI: 10.1145/3565472.3592949

Research Article

Evaluating Pre-training Strategies for Collaborative Filtering

Published: 19 June 2023

Abstract

Pre-training is essential for effective representation learning, especially in natural language processing and computer vision tasks. The core idea is to learn representations, usually through unsupervised or self-supervised approaches on large and generic source datasets, and to use those pre-trained representations (i.e., embeddings) as initial parameter values when training on the target dataset. Seminal works in this area show that pre-training can act as a regularization mechanism, placing the model parameters in regions of the optimization landscape closer to good local minima than random parameter initialization does. However, no systematic study has evaluated the effectiveness of pre-training strategies for model-based collaborative filtering. This paper conducts a broad set of experiments to evaluate different pre-training strategies for collaborative filtering, using Matrix Factorization (MF) as the base model. We show that such models, when equipped with pre-training in a transfer learning setting, can vastly improve prediction quality over the standard random-initialization baseline, reaching state-of-the-art results on standard recommender systems benchmarks. We also present alternatives for handling the out-of-vocabulary item problem (i.e., items present in the target but not in the source dataset) and show that pre-training in the context of MF acts as a regularizer, which explains the improvement in model generalization.
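The warm-start idea described in the abstract can be sketched as a toy SGD-trained matrix factorization whose item factors are initialized from pre-trained embeddings, with a random fallback for out-of-vocabulary items. This is an illustrative sketch only: the function names (`train_mf`, `init_item_factors`), hyperparameters, and update rule are assumptions, not the paper's actual implementation.

```python
import numpy as np

def init_item_factors(n_items, dim, pretrained, rng):
    # Random initialization by default; rows with a pre-trained embedding
    # are overwritten. Out-of-vocabulary items (absent from `pretrained`)
    # keep the random fallback.
    V = rng.normal(0.0, 0.1, size=(n_items, dim))
    for item, vec in pretrained.items():
        V[item] = vec
    return V

def train_mf(ratings, n_users, n_items, dim=4, epochs=500, lr=0.05, reg=0.01,
             pretrained_items=None, seed=0):
    # Plain SGD matrix factorization on (user, item, rating) triples,
    # optionally warm-started from pre-trained item embeddings.
    rng = np.random.default_rng(seed)
    U = rng.normal(0.0, 0.1, size=(n_users, dim))
    V = init_item_factors(n_items, dim, pretrained_items or {}, rng)
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - U[u] @ V[i]
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * U[u] - reg * V[i])
    return U, V
```

In a transfer learning setting, `pretrained_items` would hold embeddings learned on a large source dataset (e.g., via an item-embedding method), while `ratings` comes from the smaller target dataset; items seen only in the target keep random initialization.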


Published In

UMAP '23: Proceedings of the 31st ACM Conference on User Modeling, Adaptation and Personalization
June 2023
333 pages
ISBN: 9781450399326
DOI: 10.1145/3565472

Publisher

Association for Computing Machinery

New York, NY, United States
    Author Tags

    1. collaborative filtering
    2. model initialization
    3. transfer learning

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

Conference

UMAP '23

Acceptance Rates

Overall Acceptance Rate: 162 of 633 submissions, 26%
