research-article

Disentangled Self-Supervision in Sequential Recommenders

Authors:

Wenwu Zhu

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

Pages 483 - 491

https://doi.org/10.1145/3394486.3403091

Published: 20 August 2020

Abstract

To learn a sequential recommender, the existing methods typically adopt the sequence-to-item (seq2item) training strategy, which supervises a sequence model with a user's next behavior as the label and the user's past behaviors as the input. The seq2item strategy, however, is myopic and usually produces non-diverse recommendation lists. In this paper, we study the problem of mining extra signals for supervision by looking at the longer-term future. There exist two challenges: i) reconstructing a future sequence containing many behaviors is exponentially harder than reconstructing a single next behavior, which can lead to difficulty in convergence, and ii) the sequence of all future behaviors can involve many intentions, not all of which may be predictable from the sequence of earlier behaviors. To address these challenges, we propose a sequence-to-sequence (seq2seq) training strategy based on latent self-supervision and disentanglement. Specifically, we perform self-supervision in the latent space, i.e., reconstructing the representation of the future sequence as a whole, instead of reconstructing the items in the future sequence individually. We also disentangle the intentions behind any given sequence of behaviors and construct seq2seq training samples using only pairs of sub-sequences that involve a shared intention. Results on real-world benchmarks and synthetic data demonstrate the improvement brought by seq2seq training.
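The contrast between seq2item and seq2seq training described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the mean-pooling encoder, the fixed intention prototypes, the nearest-prototype assignment, the softmax contrastive loss, and all function names below are assumptions chosen only to make the idea concrete with the Python standard library.

```python
import math

def seq2item_sample(behaviors, t):
    """seq2item: past behaviors as input, the single next behavior as label."""
    return behaviors[:t], behaviors[t]

def seq2seq_sample(behaviors, t):
    """seq2seq: past behaviors as input, the whole future sub-sequence as target."""
    return behaviors[:t], behaviors[t:]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def encode(items, emb):
    """Hypothetical encoder (assumption): mean-pool item embeddings into one vector."""
    dim = len(emb[items[0]])
    return [sum(emb[it][d] for it in items) / len(items) for d in range(dim)]

def split_by_intention(items, emb, prototypes):
    """Assign each item to its closest intention prototype (assumed disentangler)."""
    groups = {}
    for it in items:
        k = max(range(len(prototypes)), key=lambda j: dot(emb[it], prototypes[j]))
        groups.setdefault(k, []).append(it)
    return groups

def shared_intention_pairs(past, future, emb, prototypes):
    """Keep only (past, future) sub-sequence pairs that involve a shared intention."""
    p = split_by_intention(past, emb, prototypes)
    f = split_by_intention(future, emb, prototypes)
    return {k: (p[k], f[k]) for k in p if k in f}

def latent_contrastive_loss(past_vec, future_vec, negative_vecs):
    """Score the true future representation against negatives in the latent space.

    The future is compared as a single vector, not item by item.
    """
    logits = [dot(past_vec, future_vec)] + [dot(past_vec, n) for n in negative_vecs]
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[0]

# Toy data (made up): items a, b, c follow one intention; x, y follow another.
emb = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.8, 0.0],
       "x": [0.0, 1.0], "y": [0.1, 0.9]}
prototypes = [[1.0, 0.0], [0.0, 1.0]]

past, future = seq2seq_sample(["a", "x", "b", "c"], 3)
pairs = shared_intention_pairs(past, future, emb, prototypes)

# Only intention 0 appears on both sides, so "x" yields no training pair.
past_sub, future_sub = pairs[0]
loss = latent_contrastive_loss(encode(past_sub, emb),
                               encode(future_sub, emb),
                               [encode(["x", "y"], emb)])
```

In this toy run the intention behind "x" does not recur in the future, so it is excluded from supervision, which mirrors the abstract's point that not every intention in the future sequence is predictable from the earlier behaviors.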



Index Terms

Disentangled Self-Supervision in Sequential Recommenders
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
        Learning to rank
        Ranking
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Recommender systems
  2. World Wide Web
    1. Web searching and information discovery
      1. Collaborative filtering



Published In


KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

August 2020

3664 pages

ISBN:9781450379984

DOI:10.1145/3394486

General Chairs: Rajesh Gupta (UC San Diego, USA), Yan Liu (USC, USA)

Program Chairs: Mohak Shah (LG Electronics, USA), Suju Rajan (LinkedIn, USA)

Publications Chairs: Jiliang Tang (Michigan State, USA), B. Aditya Prakash (Georgia Tech, USA)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


Publisher

Association for Computing Machinery

New York, NY, United States



Qualifiers

Research-article

Conference

KDD '20


KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

July 6 - 10, 2020

CA, Virtual Event, USA

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%



