DOI: 10.1145/3477495.3531800 · SIGIR '22 · Short paper

Progressive Self-Attention Network with Unsymmetrical Positional Encoding for Sequential Recommendation

Published: 07 July 2022

Abstract

In real-world recommendation systems, users' preferences are often shaped by both long-term stable interests and short-term temporal needs. Recently proposed Transformer-based models have proved superior for sequential recommendation, modeling temporal dynamics globally via the remarkable self-attention mechanism. However, treating all item-item interactions equally, as the original self-attention does, is cumbersome and fails to capture the drift of users' local preferences, which carry abundant short-term patterns. In this paper, we propose a novel, interpretable convolutional self-attention that efficiently captures both short- and long-term patterns with a progressive attention distribution. Specifically, a down-sampling convolution module segments the overall long behavior sequence into a series of local subsequences. Each item then interacts with these segments in the self-attention layer to produce locality-aware contextual representations, reducing the quadratic complexity of the original self-attention to nearly linear. Moreover, to further enhance robust feature learning in Transformers, an unsymmetrical positional encoding strategy is carefully designed. Extensive experiments on real-world datasets, e.g., ML-1M, Amazon Books, and Yelp, show that the proposed method outperforms state-of-the-art methods in both effectiveness and efficiency.
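To make the complexity argument in the abstract concrete, here is a minimal sketch of a self-attention layer whose keys and values are first down-sampled into local segments by a strided convolution, so that each item attends to n/seg_len segments rather than n items. The class name, kernel/stride choices, and overall wiring are illustrative assumptions, not the paper's exact module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DownsampledSelfAttention(nn.Module):
    """Illustrative self-attention with convolutionally down-sampled keys/values."""
    def __init__(self, d_model: int, seg_len: int = 4):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Strided 1-D convolutions: each output position summarizes one
        # local subsequence ("segment") of seg_len consecutive items.
        self.k_conv = nn.Conv1d(d_model, d_model, kernel_size=seg_len, stride=seg_len)
        self.v_conv = nn.Conv1d(d_model, d_model, kernel_size=seg_len, stride=seg_len)
        self.scale = d_model ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n, d_model) embeddings of one behavior sequence
        q = self.q_proj(x)                                                # (b, n, d)
        k = self.k_conv(self.k_proj(x).transpose(1, 2)).transpose(1, 2)   # (b, n/seg_len, d)
        v = self.v_conv(self.v_proj(x).transpose(1, 2)).transpose(1, 2)   # (b, n/seg_len, d)
        # The score matrix is (n x n/seg_len) instead of (n x n), so the
        # quadratic cost of full self-attention shrinks by a factor of seg_len.
        attn = F.softmax((q @ k.transpose(1, 2)) * self.scale, dim=-1)
        return attn @ v                                                   # (b, n, d)
```

Strictly, one such layer costs O(n²/seg_len) rather than O(n); it is the progressive application of this down-sampling (e.g., coarser segments at greater distances or depths) that pushes the overall cost toward the near-linear behavior claimed above.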

Supplementary Material

MP4 File (SIGIR22-sp1437.mp4)
This video summarizes the motivation of our approach, the details of the method, and the extensive experimental justification. Concretely, the core idea of this work is to capture sequential patterns from both a global and a local perspective through a newly proposed self-attention mechanism. In practice, our self-attention improves on the traditional formulation in two ways. First, we insert two convolution modules after computing the keys and values. Second, we adopt a separate branch to encode the positional information of items. In our experiments, extensive ablation and comparison studies demonstrate the effectiveness of our approach. In particular, visualizations of the attention weights show that our method not only extracts vital recent patterns progressively but also captures useful underlying item-item dependencies via the unsymmetrical positional attention weights.
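As a companion to the second point above, here is a hedged sketch of one way a decoupled positional branch can work: positions get their own embeddings and their own query/key projections, and the resulting logits are added to the content logits before the softmax. Because the two projections differ, the positional weight matrix is not symmetric, which is one reading of "unsymmetrical" here. All names below are illustrative assumptions; the paper's exact formulation may differ.

```python
import torch
import torch.nn as nn

class UntiedPositionalBranch(nn.Module):
    """Illustrative decoupled positional branch producing an (n x n) logit matrix."""
    def __init__(self, max_len: int, d_model: int):
        super().__init__()
        self.pos_emb = nn.Embedding(max_len, d_model)        # learnable absolute positions
        self.u_q = nn.Linear(d_model, d_model, bias=False)   # position-as-query projection
        self.u_k = nn.Linear(d_model, d_model, bias=False)   # position-as-key projection
        self.scale = d_model ** -0.5

    def forward(self, n: int) -> torch.Tensor:
        p = self.pos_emb(torch.arange(n))                    # (n, d)
        # u_q != u_k, so in general logits[i, j] != logits[j, i]:
        # the positional attention weights are unsymmetrical.
        return (self.u_q(p) @ self.u_k(p).T) * self.scale

# These positional logits would be added to the content logits (q @ k^T) of an
# attention layer before the softmax; combining them with the down-sampled keys
# sketched earlier would require pooling the key-position side the same way.
```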

    Information

    Published In

    SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2022
    3569 pages
ISBN: 9781450387323
DOI: 10.1145/3477495

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. interpretability
    2. progressive self-attention
    3. sequential recommendation

    Qualifiers

• Short paper

    Funding Sources

    • National Natural Science Foundation of China
    • Key Research and Development Program of Shaanxi

    Conference

    SIGIR '22

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Article Metrics

• Downloads (last 12 months): 58
• Downloads (last 6 weeks): 16

Reflects downloads up to 28 Feb 2025.

    Cited By

• (2025) Locally enhanced denoising self-attention networks and decoupled position encoding for sequential recommendation. Computers and Electrical Engineering, 123, 110064. DOI: 10.1016/j.compeleceng.2025.110064. Online publication date: Apr 2025.
• (2024) Enhancing Sequential Recommenders with Augmented Knowledge from Aligned Large Language Models. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 345-354. DOI: 10.1145/3626772.3657782. Online publication date: 10 Jul 2024.
• (2024) A global contextual enhanced structural-aware transformer for sequential recommendation. Knowledge-Based Systems, 304, 112515. DOI: 10.1016/j.knosys.2024.112515. Online publication date: Nov 2024.
• (2024) Enhanced side information fusion framework for sequential recommendation. International Journal of Machine Learning and Cybernetics, 16(2), 1157-1173. DOI: 10.1007/s13042-024-02328-8. Online publication date: 10 Sep 2024.
• (2023) GreenSeq: Automatic Design of Green Networks for Sequential Recommendation Systems. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 3364-3368. DOI: 10.1145/3539618.3591855. Online publication date: 19 Jul 2023.
• (2023) Enhancing sequential recommendation with contrastive Generative Adversarial Network. Information Processing and Management, 60(3). DOI: 10.1016/j.ipm.2023.103331. Online publication date: 1 May 2023.
• (2023) End-to-End Optimization of Quantization-Based Structure Learning and Interventional Next-Item Recommendation. In Artificial Intelligence, 415-429. DOI: 10.1007/978-981-99-8850-1_34. Online publication date: 22 Jul 2023.
