Abstract
Sequential recommendation aims to predict the next item in a sequence of user-item interactions. In learning sequence structure, it resembles language modelling; consequently, variants of the Transformer architecture, which has recently become mainstream in language modelling, have also achieved state-of-the-art performance in sequential recommendation. Despite these similarities, however, training Transformers for recommendation can be tricky: most recommendation datasets have their own unique item sets, so the pre-training/fine-tuning approach that is so successful for language models has limited applicability to recommendations. Moreover, a typical recommender system has to work with millions of items, far more than the vocabulary size of language models. In this tutorial, we cover adaptations of Transformers for sequential recommendation and techniques that help to mitigate these training challenges. The half-day (3 h + a break) tutorial consists of two sessions. The first session provides background on the Transformer architecture and its adaptations to recommendation scenarios. It covers classic Transformer-based models, such as SASRec and BERT4Rec, including their architectures, training tasks and loss functions. In this session, we also discuss the specifics of training these models on large datasets, covering negative sampling and the mitigation of the overconfidence problem that negative sampling causes. We further discuss the problem of the large item embedding tensor and approaches that mitigate it, allowing the models to be trained even with very large item catalogues. In the second part of the tutorial, we focus specifically on modern generative Transformer-based models for sequential recommendation. We discuss the specifics of generative models for sequential recommendation, such as item ID representation and recommendation list generation strategies.
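The interaction between negative sampling and overconfidence mentioned above can be illustrated with a short sketch: a binary cross-entropy loss over one positive and k sampled negatives, with a gBCE-style tempering exponent on the positive term. This is a simplified NumPy illustration, not the tutorial's or gSASRec's actual implementation; the function name and the directly exposed exponent `t` are our own choices.

```python
import numpy as np

def sampled_bce_loss(pos_score, neg_scores, t=1.0):
    """Binary cross-entropy over one positive and k sampled negative items.

    When only k negatives are sampled from a large catalogue, plain BCE
    (t = 1) pushes the model to overestimate the positive probability.
    A gBCE-style calibration tempers the positive term by raising the
    positive sigmoid to a power (here exposed directly as t); t = 1
    recovers standard sampled BCE.
    """
    pos_prob = 1.0 / (1.0 + np.exp(-pos_score))
    neg_probs = 1.0 / (1.0 + np.exp(-np.asarray(neg_scores)))
    # -log(p^t) for the positive item, -log(1 - p) for each sampled negative
    return (-np.log(pos_prob ** t + 1e-12)
            - np.sum(np.log(1.0 - neg_probs + 1e-12)))

# score the true next item against k = 4 sampled negatives
rng = np.random.default_rng(0)
neg_scores = rng.normal(size=4)
loss = sampled_bce_loss(2.0, neg_scores)
```

Higher positive scores lower the loss, and t < 1 softens the penalty on the positive term, which counteracts the overestimation induced by seeing only a tiny sampled slice of the catalogue as negatives.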
We also cover modern adaptations of large language models (LLMs) to recommender systems and discuss concrete examples, such as the P5 model. We conclude the session with our vision for the future development of the recommender systems field in the era of Large Language Models.
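To make the item-ID representation and large-catalogue points concrete, here is a minimal sketch, entirely our own illustration rather than a method from the tutorial, of compositional quotient/remainder embeddings, where two small tables replace a row-per-item embedding matrix:

```python
import numpy as np

rng = np.random.default_rng(42)

NUM_ITEMS = 1_000_000   # hypothetical catalogue size
NUM_BUCKETS = 1_000     # size of each sub-embedding table
DIM = 8                 # embedding dimension

# Two small tables (~2,001 rows in total) stand in for a
# 1,000,000-row item embedding matrix.
quotient_table = rng.normal(size=(NUM_ITEMS // NUM_BUCKETS + 1, DIM))
remainder_table = rng.normal(size=(NUM_BUCKETS, DIM))

def item_embedding(item_id):
    """Compose an item vector from quotient and remainder sub-embeddings."""
    q, r = divmod(item_id, NUM_BUCKETS)
    return quotient_table[q] + remainder_table[r]

# Distinct items get distinct (though not independent) vectors.
v1 = item_embedding(123_456)
v2 = item_embedding(123_457)
```

Here roughly (1,001 + 1,000) × 8 floats replace 1,000,000 × 8, at the cost of tied representations between items that share a quotient or remainder index; the sub-item-ID schemes discussed in the tutorial's second session refine this basic idea.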
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Petrov, A.V., Macdonald, C. (2024). Transformers for Sequential Recommendation. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14612. Springer, Cham. https://doi.org/10.1007/978-3-031-56069-9_49
DOI: https://doi.org/10.1007/978-3-031-56069-9_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56068-2
Online ISBN: 978-3-031-56069-9
eBook Packages: Computer Science, Computer Science (R0)