Embarrassingly shallow auto-encoders for dynamic collaborative filtering

User Modeling and User-Adapted Interaction

Abstract

Recent work has shown that, despite their simplicity, item-based models optimised through ridge regression can attain highly competitive results on collaborative filtering tasks. As these models are analytically computable and thus forgo the need for often expensive iterative optimisation procedures, they have become an attractive choice for practitioners. Computing the closed-form ridge regression solution requires inverting the Gramian item-item matrix, which is known to be a costly operation that scales poorly with the size of the item catalogue. Because of this bottleneck, the adoption of these methods is restricted to a specific set of problems where the number of items is modest. This can become especially problematic in real-world dynamic environments, where the model needs to keep up with incoming data to combat issues of cold start and concept drift. In this work, we propose Dynamic \(\textsc{ease}^{\textsc{r}}\): an algorithm based on the Woodbury matrix identity that incrementally updates an existing regression model when new data arrives, either approximately or exactly. By exploiting a widely accepted low-rank assumption for the user-item interaction data, this allows us to target those parts of the resulting model that need updating, and avoid a costly inversion of the entire item-item matrix with every update. We theoretically and empirically show that our newly proposed methods can entail significant efficiency gains in the right settings, broadening the scope of problems for which closed-form models are an appropriate choice.
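
To make the role of the Woodbury matrix identity concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the simplest kind of update, where new user rows `X_new` are appended to the interaction matrix `X`, so that the Gramian changes by the low-rank term `X_new.T @ X_new`. The function names, the toy data and the regularisation strength `lam` are all hypothetical; the zero-diagonal weight formula follows the closed-form solution of Steck (2019).

```python
import numpy as np

def ease_weights(P):
    """Zero-diagonal item-item weights from P = (X^T X + lam * I)^{-1}."""
    B = -P / np.diag(P)           # broadcasting divides column j by P[j, j]
    np.fill_diagonal(B, 0.0)      # enforce the zero-diagonal constraint
    return B

def woodbury_append_rows(P, X_new):
    """Update P when rows X_new are appended, i.e. G' = G + X_new^T X_new.

    Only an (m x m) system is solved, with m the number of new rows,
    instead of inverting the full (n_items x n_items) matrix again.
    """
    PXt = P @ X_new.T                               # n_items x m
    S = np.eye(X_new.shape[0]) + X_new @ PXt        # m x m
    # P is symmetric, so PXt.T equals X_new @ P.
    return P - PXt @ np.linalg.solve(S, PXt.T)

# Toy data (hypothetical): 200 existing users, 5 new users, 30 items.
rng = np.random.default_rng(0)
X = (rng.random((200, 30)) < 0.1).astype(float)
X_new = (rng.random((5, 30)) < 0.1).astype(float)
lam = 10.0

P = np.linalg.inv(X.T @ X + lam * np.eye(30))
P_updated = woodbury_append_rows(P, X_new)
B_updated = ease_weights(P_updated)

# Reference: recompute the inverse from scratch on all data.
X_all = np.vstack([X, X_new])
P_full = np.linalg.inv(X_all.T @ X_all + lam * np.eye(30))
max_abs_diff = np.max(np.abs(P_updated - P_full))
print(max_abs_diff)
```

In this setting the update is exact up to floating-point error, and its cost is driven by the number of new rows rather than by the size of the item catalogue. More general updates, such as new interactions for existing users, change the Gramian by a term that is not of this simple form and would need a low-rank factorisation first.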

Notes

  1. It should be noted that the authors have since released a more performant coordinate-descent-based implementation of their method (Ning et al. 2019).

  2. In our experiments, we use an efficient SciPy implementation of a variant called the Implicitly Restarted Lanczos Method (Lehoucq et al. 1998; Virtanen et al. 2020); the analysis is equivalent.

  3. In the SciPy package for Python, an implementation of the randomised method presented by Liberty et al. can be found under scipy.linalg.interpolative.estimate_rank (Liberty et al. 2007; Virtanen et al. 2020).
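
As a rough illustration of the two SciPy routines referred to in notes 2 and 3, the sketch below applies them to a small synthetic low-rank symmetric matrix. It assumes that the Lanczos variant in note 2 corresponds to `scipy.sparse.linalg.eigsh`, which wraps ARPACK; the note itself does not name the function. The matrix `delta_G`, its size and the tolerance are hypothetical choices.

```python
import numpy as np
from scipy.sparse.linalg import eigsh
from scipy.linalg.interpolative import estimate_rank

# Synthetic symmetric update matrix of rank 4 (hypothetical data).
rng = np.random.default_rng(0)
n_items, true_rank = 60, 4
F = rng.standard_normal((n_items, true_rank))
delta_G = F @ F.T

# Truncated eigendecomposition: the k eigenpairs of largest magnitude.
k = 4
eigvals, eigvecs = eigsh(delta_G, k=k, which="LM")
approx = (eigvecs * eigvals) @ eigvecs.T      # Q diag(lambda) Q^T
recon_err = np.max(np.abs(approx - delta_G))

# Randomised estimate of the numerical rank to relative precision 1e-8.
# The SciPy documentation notes that, for dense arrays, the returned
# value can overestimate the actual numerical rank.
rank_estimate = estimate_rank(delta_G, 1e-8)

print(recon_err, rank_estimate)
```

Because `delta_G` has exactly rank 4 here, the four leading eigenpairs reconstruct it up to floating-point error; on real data the rank estimate would be used to choose `k`.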

References

  • Alman, J., Vassilevska Williams, V.: A refined laser method and faster matrix multiplication. In: Proceedings of the Thirty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’21. Society for Industrial and Applied Mathematics (2021)

  • Anyosa, S.C., Vinagre, J., Jorge, A.M.: Incremental matrix co-factorization for recommender systems with implicit feedback. In: Companion Proceedings of the The Web Conference 2018, WWW ’18, pp. 1413–1418. International World Wide Web Conferences Steering Committee (2018). ISBN 9781450356404

  • Beel, J., Brunel, V.: Data pruning in recommender systems research: Best practice or malpractice? In: Proceedings of the 13th ACM Conference on Recommender Systems, RecSys ’19 (2019)

  • Ben-Shimon, D., Tsikinovsky, A., Friedmann, M., Shapira, B., Rokach, L., Hoerle, J.: Recsys challenge 2015 and the yoochoose dataset. In: Proceedings of the 9th ACM Conference on Recommender Systems, RecSys ’15, pp. 357–358. ACM (2015)

  • Bertin-Mahieux, T., Ellis, D.P.W., Whitman, B., Lamere, P.: The million song dataset. In: Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR ’11 (2011)

  • Borchers, A., Herlocker, J., Konstan, J., Riedl, J.: Ganging up on information overload. Computer 31(4), 106–108 (1998). ISSN 0018-9162

  • Castells, P., Hurley, N.J., Vargas, S.: Novelty and Diversity in Recommender Systems, pp. 881–918. Springer, US (2015)

  • Chen, B., Liu, Z.: Lifelong machine learning. Synth. Lect. Artif. Intel. Mach. Learn. 12(3), 1–207 (2018)

  • Chen, Y., Wang, Y., Zhao, X., Zou, J., de Rijke, M.: Block-aware item similarity models for top-n recommendation. ACM Trans. Inf. Syst. 38(4), 1–26 (2020)

  • Christakopoulou, E., Karypis, G.: Hoslim: higher-order sparse linear method for top-n recommender systems. In: Advances in Knowledge Discovery and Data Mining, pp. 38–49. Springer, New York (2014)

  • Christakopoulou, E., Karypis, G.: Local item-item models for top-n recommendation. In: Proceedings of the 10th ACM Conference on Recommender Systems, RecSys ’16, pp. 67–74. ACM (2016). ISBN 978-1-4503-4035-9

  • Dacrema, M.F., Cremonesi, P., Jannach, D.: Are we really making much progress? a worrying analysis of recent neural recommendation approaches. In: Proceedings of the 13th ACM Conference on Recommender Systems, RecSys ’19, pp. 101–109. ACM, (2019). ISBN 978-1-4503-6243-6

  • Deshpande, M., Karypis, G.: Item-based top-n recommendation algorithms. ACM Trans. Inf. Syst. 22(1), 143–177 (2004)

  • Ekstrand, M.D., Riedl, J.T., Konstan, J.A.: Collaborative filtering recommender systems. Found. Trends Hum. Comput. Interact. 4(2), 81–173 (2011). ISSN 1551-3955

  • Elahi, E., Wang, W., Ray, D., Fenton, A., Jebara, T.: Variational low rank multinomials for collaborative filtering with side-information. In: Proceedings of the 13th ACM Conference on Recommender Systems, RecSys ’19, pp. 340–347. ACM (2019). ISBN 978-1-4503-6243-6

  • Ferreira, E.J., Enembreck, F., Barddal, J.P.: Adadrift: An adaptive learning technique for long-history stream-based recommender systems. In: Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 2593–2600 (2020)

  • Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., Bouchachia, A.: A survey on concept drift adaptation. ACM Comput. Surv. 46(4) (2014). ISSN 0360-0300

  • Gulla, J.A., Zhang, L., Liu, P., Özgöbek, Ö., Su, X.: The Adressa dataset for news recommendation. In: Proceedings of the International Conference on Web Intelligence, WI ’17, pp. 1042–1048. Association for Computing Machinery (2017). ISBN 9781450349512

  • Hager, W.W.: Updating the inverse of a matrix. SIAM Rev. 31(2), 221–239 (1989)

  • Halko, N., Martinsson, P.G., Tropp, J.A.: Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev. 53(2), 217–288 (2011). https://doi.org/10.1137/090771806

  • Harper, F.M., Konstan, J.A.: The MovieLens datasets: history and context. ACM Trans. Interact. Intel. Syst. 5(4), 19:1–19:19 (2015)

  • He, X., Zhang, H., Kan, M., Chua, T.: Fast matrix factorization for online recommendation with implicit feedback. In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’16, pp. 549–558. ACM (2016)

  • Jeunen, O.: Revisiting offline evaluation for implicit-feedback recommender systems. In: Proceedings of the 13th ACM Conference on Recommender Systems, RecSys ’19, pp. 596–600. ACM, (2019). ISBN 978-1-4503-6243-6

  • Jeunen, O., Goethals, B.: Pessimistic reward models for off-policy learning in recommendation. In: Proceedings of the 15th ACM Conference on Recommender Systems, RecSys ’21 (2021)

  • Jeunen, O., Verstrepen, K., Goethals, B.: Fair offline evaluation methodologies for implicit-feedback recommender systems with mnar data. In: Proceedings of the REVEAL 18 Workshop on Offline Evaluation for Recommender Systems (RecSys ’18), October (2018)

  • Jeunen, O., Verstrepen, K., Goethals, B.: Efficient similarity computation for collaborative filtering in dynamic environments. In: Proceedings of the 13th ACM Conference on Recommender Systems, RecSys ’19, pp. 251–259. ACM (2019). ISBN 978-1-4503-6243-6

  • Jeunen, O., Van Balen, J., Goethals, B.: Closed-form models for collaborative filtering with side-information. In: Fourteenth ACM Conference on Recommender Systems, RecSys ’20, pp. 651–656, (2020). ISBN 9781450375832

  • Kabbur, S., Ning, X., Karypis, G.: Fism: Factored item similarity models for top-n recommender systems. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’13, pp. 659–667, (2013). ISBN 9781450321747

  • Kaggle. RetailRocket Recommender System Dataset, 2016. URL https://www.kaggle.com/retailrocket/ecommerce-dataset

  • Khawar, F., Poon, L., Zhang, N.L.: Learning the structure of auto-encoding recommenders. In: Proceedings of The Web Conference 2020, WWW ’20, pp. 519–529. ACM (2020)

  • Koren, Y.: Factorization meets the neighborhood: a multifaceted collaborative filtering model. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’08, pp. 426–434. Association for Computing Machinery, (2008). ISBN 9781605581934

  • Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems. Computer 42(8), 30–37 (2009). ISSN 0018-9162

  • Lanczos, C.: An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. J. Res. Natl. Bur. Stand. 45, 255–282 (1950)

  • Le Gall, F.: Powers of tensors and fast matrix multiplication. In: Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, ISSAC ’14, pp. 296–303. Association for Computing Machinery (2014)

  • Lehoucq, R.B., Sorensen, D. C., Yang, C.: ARPACK users’ guide: solution of large-scale eigenvalue problems with implicitly restarted Arnoldi methods. SIAM (1998)

  • Levy, M., Jack, K.: Efficient top-n recommendation by linear regression. In: RecSys Large Scale Recommender Systems Workshop (2013)

  • Levy, O., Goldberg, Y.: Neural word embedding as implicit matrix factorization. Adv. Neural Inform. Process. Syst. 27, 2177–2185 (2014)

  • Liang, D., Krishnan, R.G., Hoffman, M.D., Jebara, T.: Variational autoencoders for collaborative filtering. In: Proceedings of the 2018 World Wide Web Conference, WWW ’18, pp. 689–698. International World Wide Web Conferences Steering Committee, ACM (2018)

  • Liberty, E., Woolfe, F., Martinsson, P., Rokhlin, V., Tygert, M.: Randomized algorithms for the low-rank approximation of matrices. Proc. Natl. Acad. Sci. 104(51), 20167–20172 (2007)

  • Ludewig, M., Jannach, D.: Evaluation of session-based recommendation algorithms. User Model. User-Adap. Inter. 28(4), 331–390 (2018)

  • Martinsson, P.G., Rokhlin, V., Tygert, M.: A randomized algorithm for the decomposition of matrices. Appl. Comput. Harmon. Anal. 30(1), 47–68 (2011). ISSN 1063-5203

  • Matuszyk, P., Vinagre, J., Spiliopoulou, M., Jorge, A.M., Gama, J.: Forgetting techniques for stream-based matrix factorization in recommender systems. Knowl. Inf. Syst. 55(2), 275–304 (2018). https://doi.org/10.1007/s10115-017-1091-8

  • Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Adv. Neural Inform. Process. Syst. 26, 3111–3119 (2013)

  • Ning, X., Karypis, G.: Slim: Sparse linear methods for top-n recommender systems. In: Proceedings of the 2011 IEEE 11th International Conference on Data Mining, ICDM ’11, pp. 497–506. IEEE Computer Society (2011). ISBN 978-0-7695-4408-3

  • Ning, X., Karypis, G.: Sparse linear methods with side information for top-n recommendations. In: Proceedings of the 6th ACM Conference on Recommender Systems, RecSys ’12, pp. 155–162, (2012). ISBN 9781450312707

  • Ning, X., Nikolakopoulos, A.N., Shui, Z., Sharma, M., Karypis, G.: SLIM Library for Recommender Systems (2019). URL https://github.com/KarypisLab/SLIM

  • Paige, C.C.: Accuracy and effectiveness of the Lanczos algorithm for the symmetric eigenproblem. Linear Algebra Appl. 34, 235–258 (1980)

  • Pan, V.Y., Chen, Z.Q.: The complexity of the matrix eigenproblem. In: Proceedings of the 31st Annual ACM Symposium on Theory of Computing, STOC ’99, pp. 507–516. Association for Computing Machinery, (1999)

  • Park, Y., Tuzhilin, A.: The long tail of recommender systems and how to leverage it. In: Proceedings of the 2008 ACM Conference on Recommender Systems, RecSys ’08, pp. 11–18, (2008). URL https://doi.org/10.1145/1454008.1454012

  • Rendle, S.: Evaluation metrics for item recommendation under sampling (2019)

  • Rendle, S., Freudenthaler, C., Gantner, Z., Schmidt-Thieme, L.: BPR: Bayesian personalized ranking from implicit feedback. In: Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, UAI ’09, pp. 452–461. AUAI Press (2009)

  • Rendle, S., Krichene, W., Zhang, L., Anderson, J.: Neural collaborative filtering vs. matrix factorization revisited. In: Fourteenth ACM Conference on Recommender Systems, RecSys ’20, pp. 240–248, (2020). ISBN 9781450375832

  • Sarwar, B., Karypis, G., Konstan, J., Riedl, J.: Item-based collaborative filtering recommendation algorithms. In: Proceedings of the 10th International Conference on World Wide Web, WWW ’01, pp. 285–295. ACM (2001)

  • Schein, A.I., Popescul, A., Ungar, L.H., Pennock, D.M.: Methods and metrics for cold-start recommendations. In: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’02, pp. 253–260. ACM (2002)

  • Sedhain, S., Menon, A.K., Sanner, S., Braziunas, D.: On the effectiveness of linear models for one-class collaborative filtering. In: Proceedings of the 30th AAAI Conference on Artificial Intelligence, AAAI’16, pp. 229–235 (2016)

  • Shenbin, I., Alekseev, A., Tutubalina, E., Malykh, V., Nikolenko, S.I.: Recvae: A new variational autoencoder for top-n recommendations with implicit feedback. In: Proceedings of the 13th International Conference on Web Search and Data Mining, WSDM ’20, pp. 528–536, (2020)

  • Shi, Y., Larson, M., Hanjalic, A.: Collaborative filtering beyond the user-item matrix: a survey of the state of the art and future challenges. ACM Comput. Surv. 47(1) (2014)

  • Steck, H.: Embarrassingly shallow autoencoders for sparse data. In: The World Wide Web Conference, WWW ’19, pp. 3251–3257 (2019)

  • Steck, H.: Collaborative filtering via high-dimensional regression. CoRR, abs/1904.13033 (2019)

  • Steck, H.: Markov random fields for collaborative filtering. Adv. Neural Inform. Process. Syst. 32, 5473–5484 (2019)

  • Steck, H., Dimakopoulou, M., Riabov, N., Jebara, T.: Admm slim: Sparse recommendations for many users. In: Proceedings of the 13th International Conference on Web Search and Data Mining, WSDM ’20, pp. 555–563 (2020)

  • Ubaru, S., Saad, Y.: Fast methods for estimating the numerical rank of large matrices. In: Proceedings of the 33rd International Conference on Machine Learning, Vol. 48, ICML’16, pp. 468–477. JMLR.org (2016)

  • Van Balen, J., Goethals, B.: High-dimensional sparse embeddings for collaborative filtering. In: Proceedings of the Web Conference 2021, WWW ’21, pp. 575–581. ACM (2021)

  • Verstrepen, K., Bhaduri, K., Cule, B., Goethals, B.: Collaborative filtering for binary, positive-only data. SIGKDD Explor. Newsl. 19(1), 1–21 (2017). ISSN 1931-0145

  • Vinagre, J., Jorge, A.M., Gama, J.: Fast incremental matrix factorization for recommendation with positive-only feedback. In: User Modeling, Adaptation, and Personalization, pp. 459–470. Springer, Cham (2014)

  • Vinagre, J., Jorge, A.M., Gama, J.: Collaborative filtering with recency-based negative feedback. In: Proceedings of the 30th Annual ACM Symposium on Applied Computing, SAC ’15, pp. 963–965, (2015). ISBN 9781450331968

  • Vinagre, J., Jorge, A.M., Al-Ghossein, M., Bifet, A.: ORSUM - Workshop on Online Recommender Systems and User Modeling, pp. 619–620. ACM, (2020). ISBN 9781450375832

  • Viniski, A.D., Barddal, J.P., de Souza Britto Jr., A., Enembreck, F., de Campos, H.V.A.: A case study of batch and incremental recommender systems in supermarket data under concept drifts and cold start. Expert Systems with Applications 176, 114890 (2021). ISSN 0957-4174

  • Virtanen, P., Gommers, R., Oliphant, T.E., et al.: SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17(3), 261–272 (2020)

  • Wu, F., Qiao, Y., Chen, J., Wu, C., Qi, T., Lian, J., Liu, D., Xie, X., Gao, J., Wu, W., Zhou, M.: MIND: A large-scale dataset for news recommendation. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL’20, pp. 3597–3606. Association for Computational Linguistics, (2020)

Acknowledgements

This work received funding from the Flemish Government (AI Research Programme).

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Jan Van Balen: work done while the author was at Adrem Data Lab.

About this article

Cite this article

Jeunen, O., Van Balen, J. & Goethals, B. Embarrassingly shallow auto-encoders for dynamic collaborative filtering. User Model User-Adap Inter 32, 509–541 (2022). https://doi.org/10.1007/s11257-021-09314-7

  • DOI: https://doi.org/10.1007/s11257-021-09314-7
