Towards more effective encoders in pre-training for sequential recommendation

World Wide Web

Abstract

Pre-training has emerged as a new learning paradigm in natural language processing and computer vision. Several seminal studies have also introduced it into sequential recommendation to alleviate the data sparsity issue. However, existing methods adopt the bidirectional transformer as the encoder, which suffers from two drawbacks. One is insufficient intention modeling: the transformer architecture is suitable for extracting distributed consumption intentions but cannot adequately capture users' concentrated and occasional consumption intentions. The other is information leakage, caused by seeing future items in advance during the bidirectional encoding process. To address these problems, we propose to construct more effective encoders in pre-training for sequential recommendation. Specifically, we first decouple the original bidirectional process in the transformer structure into two unidirectional processes, which avoids the information leakage problem while still capturing distributed consumption intentions. We then employ locality-aware convolutional neural networks (CNNs) with a narrow receptive field to model concentrated consumption. We also introduce a random shuffle strategy that enables the CNN to model occasional consumption. Experiments on five datasets demonstrate that our method substantially improves the performance of various types of downstream sequential recommendation models, and that it achieves better overall performance than state-of-the-art self-supervised pre-training methods.
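The three encoder ideas named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`directional_masks`, `masked_self_attention`, `local_conv`, `shuffle_sequence`), the projection-free single-head attention, the kernel shared across embedding dimensions, and the whole-sequence permutation are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np


def directional_masks(seq_len):
    """Return (forward, backward) boolean attention masks.

    forward[i, j] is True when position i may attend to position j <= i;
    backward[i, j] is True when position i may attend to position j >= i.
    Two such unidirectional passes stand in for one bidirectional pass.
    """
    ones = np.ones((seq_len, seq_len), dtype=bool)
    return np.tril(ones), np.triu(ones)


def masked_self_attention(x, mask):
    """Single-head scaled dot-product self-attention (no learned projections).

    x has shape (seq_len, dim); mask has shape (seq_len, seq_len).
    """
    scores = x @ x.T / np.sqrt(x.shape[-1])
    scores = np.where(mask, scores, -np.inf)      # hide disallowed positions
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)                      # exp(-inf) == 0
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ x


def local_conv(x, kernel):
    """1-D convolution along the sequence with a narrow window.

    kernel has shape (k,) with odd k and is shared across embedding
    dimensions (an assumption). Zero padding keeps the output length
    equal to the input length.
    """
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([
        (padded[i:i + k] * kernel[:, None]).sum(axis=0)
        for i in range(x.shape[0])
    ])


def shuffle_sequence(x, rng):
    """Randomly permute item order (assumed form of the shuffle strategy)."""
    return x[rng.permutation(x.shape[0])]
```

With the forward mask, changing a later item leaves the representations of all earlier positions unchanged, which is the property that rules out leakage from future items; the backward mask gives the mirror-image guarantee. The convolution output at a position depends only on its immediate neighbours, which is what a narrow receptive field means here.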

Notes

  1. On the Amazon and LastFM datasets used in our experiments, we do not observe sufficient periodic repeat-item records in Amazon, and most repeated music records in LastFM follow a loop style. Although periodic repeat-category records can be observed in both datasets, modeling them requires extra information, which would be unfair to the other baseline methods along this line.

Acknowledgements

This work was supported by a grant from the National Natural Science Foundation of China (NSFC) project (No. 62276193) and a grant from the Key Laboratory of Satellite Information Intelligent Processing and Application Technology (No. 2022-ZZKY-JJ-16-01). It was also supported by the Joint Lab. on Credit Sci. and Tech. of CSCI-Wuhan University.

Funding

This work was supported by a grant from the National Natural Science Foundation of China (NSFC) project (No. 62276193). It was also supported by a grant from the Key Laboratory of Satellite Information Intelligent Processing and Application Technology (No. 2022-ZZKY-JJ-16-01).

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Ke Sun and Tieyun Qian. The first draft of the manuscript was written by Ke Sun and revised by Tieyun Qian. Ming Zhong and Xuhui Li helped perform the analysis with constructive discussions. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Tieyun Qian.

Ethics declarations

Financial interests

The authors declare they have no financial interests.

Non-financial interests

The authors declare they have no non-financial interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sun, K., Qian, T., Zhong, M. et al. Towards more effective encoders in pre-training for sequential recommendation. World Wide Web 26, 2801–2832 (2023). https://doi.org/10.1007/s11280-023-01163-1

  • DOI: https://doi.org/10.1007/s11280-023-01163-1
