DOI: 10.1145/3565472.3595630

Scalable and Explainable Linear Shallow Autoencoders for Collaborative Filtering from Industrial Perspective

Published: 19 June 2023

Abstract

Linear shallow autoencoders for collaborative filtering are gaining popularity in the research community, and industrial providers of recommender systems are also taking notice. However, despite their simplicity and accuracy, these models often cannot be used in real-world industrial recommender systems because they do not scale to very large interaction matrices. Our research aims to address this issue by developing a scalable, explainable, and accurate linear shallow autoencoder method for collaborative filtering that meets the demands of real-world recommenders. In this paper, we present our industrial Ph.D. research project, which includes (1) the development of a scalable method called ELSA and its adaptation to a large real-world recommender, and (2) the creation of a framework for visualizing recommender system insights by modeling the distribution of retrieval metrics in the latent user space. We discuss the current status of the project, the key steps needed to finish it, and possible extensions after the dissertation.
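The scaling limit mentioned in the abstract can be illustrated with rough arithmetic. The sketch below assumes, as the abstract itself does not spell out, that a full-rank linear shallow autoencoder stores a dense item-item weight matrix, while a scalable variant stores only an item-factor matrix of some latent dimension. The function names, the catalog sizes, the latent dimension of 256, and the 4-byte float width are all hypothetical choices for illustration, not values from the paper.

```python
# Illustrative memory arithmetic only; all sizes below are assumed, not from the paper.

def dense_weights_bytes(n_items: int, bytes_per_value: int = 4) -> int:
    """Memory for a dense item-item weight matrix of shape (n_items, n_items)."""
    return n_items * n_items * bytes_per_value


def low_rank_weights_bytes(n_items: int, latent_dim: int, bytes_per_value: int = 4) -> int:
    """Memory for one item-factor matrix of shape (n_items, latent_dim)."""
    return n_items * latent_dim * bytes_per_value


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to gibibytes."""
    return n_bytes / 2**30


if __name__ == "__main__":
    latent_dim = 256  # hypothetical latent dimension
    for n_items in (10_000, 1_000_000):  # hypothetical catalog sizes
        dense = to_gib(dense_weights_bytes(n_items))
        low_rank = to_gib(low_rank_weights_bytes(n_items, latent_dim))
        print(f"{n_items:>9} items: dense {dense:,.2f} GiB, low-rank {low_rank:,.2f} GiB")
```

Under these assumptions the dense matrix grows quadratically with the number of items, whereas the factor matrix grows linearly, which is one way to read why very large catalogs are problematic for the full-rank models.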

Published In

UMAP '23: Proceedings of the 31st ACM Conference on User Modeling, Adaptation and Personalization
June 2023
333 pages
ISBN: 9781450399326
DOI: 10.1145/3565472
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Distribution analysis
  2. Implicit feedback recommendation
  3. Linear models
  4. Recommender systems
  5. Shallow autoencoders
  6. User simulation

Qualifiers

  • Extended-abstract
  • Research
  • Refereed limited

Conference

UMAP '23
Acceptance Rates

Overall Acceptance Rate 162 of 633 submissions, 26%
