DOI: 10.1145/3477495.3531800 · SIGIR '22 · Short paper

Progressive Self-Attention Network with Unsymmetrical Positional Encoding for Sequential Recommendation

Published: 07 July 2022

Abstract

In real-world recommendation systems, users' preferences are often shaped by both long-term stable interests and short-term temporal needs. Recently proposed Transformer-based models have proved superior for sequential recommendation, modeling temporal dynamics globally via the remarkable self-attention mechanism. However, treating all item-item interactions equally, as the original self-attention does, is cumbersome and fails to capture the drift of users' local preferences, which carry abundant short-term patterns. In this paper, we propose a novel, interpretable convolutional self-attention that efficiently captures both short- and long-term patterns with a progressive attention distribution. Specifically, a down-sampling convolution module segments the overall long behavior sequence into a series of local subsequences. Each item then interacts with these segments in the self-attention layer to produce locality-aware contextual representations, reducing the quadratic complexity of the original self-attention to nearly linear. Moreover, to further enhance robust feature learning in Transformers, an unsymmetrical positional encoding strategy is carefully designed. Extensive experiments on real-world datasets, e.g., ML-1M, Amazon Books, and Yelp, show that the proposed method outperforms state-of-the-art methods in both effectiveness and efficiency.
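To make the complexity argument in the abstract concrete, here is a minimal sketch of a self-attention layer whose keys and values are first down-sampled into local segments by a strided convolution, so that each item attends to n/seg_len segments rather than n items. The class name, kernel/stride choices, and overall wiring are illustrative assumptions, not the paper's exact module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DownsampledSelfAttention(nn.Module):
    """Illustrative self-attention with convolutionally down-sampled keys/values."""
    def __init__(self, d_model: int, seg_len: int = 4):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Strided 1-D convolutions: each output position summarizes one
        # local subsequence ("segment") of seg_len consecutive items.
        self.k_conv = nn.Conv1d(d_model, d_model, kernel_size=seg_len, stride=seg_len)
        self.v_conv = nn.Conv1d(d_model, d_model, kernel_size=seg_len, stride=seg_len)
        self.scale = d_model ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n, d_model) embeddings of one behavior sequence
        q = self.q_proj(x)                                                # (b, n, d)
        k = self.k_conv(self.k_proj(x).transpose(1, 2)).transpose(1, 2)   # (b, n/seg_len, d)
        v = self.v_conv(self.v_proj(x).transpose(1, 2)).transpose(1, 2)   # (b, n/seg_len, d)
        # The score matrix is (n x n/seg_len) instead of (n x n), so the
        # quadratic cost of full self-attention shrinks by a factor of seg_len.
        attn = F.softmax((q @ k.transpose(1, 2)) * self.scale, dim=-1)
        return attn @ v                                                   # (b, n, d)
```

Strictly, one such layer costs O(n²/seg_len) rather than O(n); it is the progressive application of this down-sampling (e.g., coarser segments at greater distances or depths) that pushes the overall cost toward the near-linear behavior claimed above.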

Supplementary Material

MP4 File (SIGIR22-sp1437.mp4)
This video summarizes the motivation of our approach, the details of the method, and the extensive experimental justification. Concretely, the core idea of this work is to capture sequential patterns from both a global and a local perspective through a newly proposed self-attention mechanism. In practice, our self-attention improves on the traditional formulation in two ways. First, we insert two convolution modules after computing the keys and values. Second, we adopt a separate branch to encode the positional information of items. In our experiments, extensive ablation and comparison studies demonstrate the effectiveness of our approach. In particular, visualizations of the attention weights show that our method not only extracts vital recent patterns progressively but also captures useful underlying item-item dependencies via the unsymmetrical positional attention weights.
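As a companion to the second point above, here is a hedged sketch of one way a decoupled positional branch can work: positions get their own embeddings and their own query/key projections, and the resulting logits are added to the content logits before the softmax. Because the two projections differ, the positional weight matrix is not symmetric, which is one reading of "unsymmetrical" here. All names below are illustrative assumptions; the paper's exact formulation may differ.

```python
import torch
import torch.nn as nn

class UntiedPositionalBranch(nn.Module):
    """Illustrative decoupled positional branch producing an (n x n) logit matrix."""
    def __init__(self, max_len: int, d_model: int):
        super().__init__()
        self.pos_emb = nn.Embedding(max_len, d_model)        # learnable absolute positions
        self.u_q = nn.Linear(d_model, d_model, bias=False)   # position-as-query projection
        self.u_k = nn.Linear(d_model, d_model, bias=False)   # position-as-key projection
        self.scale = d_model ** -0.5

    def forward(self, n: int) -> torch.Tensor:
        p = self.pos_emb(torch.arange(n))                    # (n, d)
        # u_q != u_k, so in general logits[i, j] != logits[j, i]:
        # the positional attention weights are unsymmetrical.
        return (self.u_q(p) @ self.u_k(p).T) * self.scale

# These positional logits would be added to the content logits (q @ k^T) of an
# attention layer before the softmax; combining them with the down-sampled keys
# sketched earlier would require pooling the key-position side the same way.
```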

    Information

    Published In

    SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2022
    3569 pages
ISBN: 9781450387323
DOI: 10.1145/3477495

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. interpretability
    2. progressive self-attention
    3. sequential recommendation

    Qualifiers

• Short paper

    Funding Sources

    • National Natural Science Foundation of China
    • Key Research and Development Program of Shaanxi

    Conference

    SIGIR '22

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Article Metrics

• Downloads (last 12 months): 58
• Downloads (last 6 weeks): 16

Reflects downloads up to 28 Feb 2025.

    Cited By

• (2025) Locally enhanced denoising self-attention networks and decoupled position encoding for sequential recommendation. Computers and Electrical Engineering, 123, 110064. DOI: 10.1016/j.compeleceng.2025.110064. Online publication date: Apr 2025.
• (2024) Enhancing Sequential Recommenders with Augmented Knowledge from Aligned Large Language Models. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 345-354. DOI: 10.1145/3626772.3657782. Online publication date: 10 Jul 2024.
• (2024) A global contextual enhanced structural-aware transformer for sequential recommendation. Knowledge-Based Systems, 304, 112515. DOI: 10.1016/j.knosys.2024.112515. Online publication date: Nov 2024.
• (2024) Enhanced side information fusion framework for sequential recommendation. International Journal of Machine Learning and Cybernetics, 16(2), 1157-1173. DOI: 10.1007/s13042-024-02328-8. Online publication date: 10 Sep 2024.
• (2023) GreenSeq: Automatic Design of Green Networks for Sequential Recommendation Systems. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 3364-3368. DOI: 10.1145/3539618.3591855. Online publication date: 19 Jul 2023.
• (2023) Enhancing sequential recommendation with contrastive Generative Adversarial Network. Information Processing and Management, 60(3). DOI: 10.1016/j.ipm.2023.103331. Online publication date: 1 May 2023.
• (2023) End-to-End Optimization of Quantization-Based Structure Learning and Interventional Next-Item Recommendation. In Artificial Intelligence, 415-429. DOI: 10.1007/978-981-99-8850-1_34. Online publication date: 22 Jul 2023.
