
End-to-End Optimization of Quantization-Based Structure Learning and Interventional Next-Item Recommendation

  • Conference paper
  • Artificial Intelligence (CICAI 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14473)

Abstract

With the development of deep learning, related techniques are increasingly used in recommender systems, making them more effective and reliable. However, because real-world data follow varying distributions, deep-learning-based methods learn only correlations in the data rather than actual causal effects, which degrades recommendation performance when a distribution shift occurs. Causal structure learning, which searches for causal relationships between variables, has therefore been applied to recommender systems. However, existing methods assume that the recommender system is a non-interventional environment, so the learned causal graph is not entirely correct. In this paper, we propose to decouple the recommender module and the causal module so that the intervention of the recommender system is taken into account when building the causal graph. We use vector quantization to learn a cluster-level graph rather than an item-level graph, which keeps training time acceptable. With an adjustable number of clusters, our model can adapt to datasets of any size and be trained within a bounded period. We conduct extensive experiments on both real-world and synthetic OOD datasets to demonstrate that our model is more effective than other state-of-the-art sequential recommenders.
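The quantization step mentioned in the abstract can be illustrated with a minimal sketch in the spirit of VQ-VAE [18]; this is not the authors' exact module, and the function name and data layout are assumptions. Each item embedding is mapped to the index of its nearest codebook vector, so a causal graph can be learned over a small, fixed number of clusters instead of over all items.

```python
def vq_assign(item_embs, codebook):
    """Assign each item embedding to its nearest codebook vector
    (cluster). Embeddings and codebook entries are plain lists of
    floats; returns one cluster index per item."""
    def sqdist(a, b):
        # Squared Euclidean distance between two vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [min(range(len(codebook)), key=lambda j: sqdist(e, codebook[j]))
            for e in item_embs]
```

With, say, two codebook vectors, every item collapses onto one of two cluster ids, so the learned graph has a size controlled by the number of clusters rather than by the catalogue size.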


Notes

  1. https://grouplens.org/datasets/movielens/1m/.

  2. https://www.kaggle.com/code/laowingkin/netflix-movie-recommendation.

  3. https://nijianmo.github.io/amazon/index.html.

  4. https://grouplens.org/datasets/movielens/1m/.

  5. https://nijianmo.github.io/amazon/index.html.

  6. https://www.kaggle.com/code/laowingkin/netflix-movie-recommendation.

  7. https://www.learndatasci.com/glossary/jaccard-similarity/.

References

  1. Cen, Y., Zhang, J., Zou, X., Zhou, C., Yang, H., Tang, J.: Controllable multi-interest framework for recommendation. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2942–2951 (2020)

  2. Chen, W., Wu, Y., Cai, R., Chen, Y., Hao, Z.: CCSL: a causal structure learning method from multiple unknown environments. arXiv preprint arXiv:2111.09666 (2021)

  3. Covington, P., Adams, J., Sargin, E.: Deep neural networks for YouTube recommendations. In: Proceedings of the 10th ACM Conference on Recommender Systems, pp. 191–198 (2016)

  4. Dong, X., Yang, Y.: Searching for a robust neural architecture in four GPU hours. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1761–1770 (2019)

  5. Gao, C., et al.: Cross-domain recommendation with bridge-item embeddings. ACM Trans. Knowl. Discov. Data (TKDD) 16(1), 1–23 (2021)

  6. Gao, C., Zheng, Y., Wang, W., Feng, F., He, X., Li, Y.: Causal inference in recommender systems: a survey and future directions. arXiv preprint arXiv:2208.12397 (2022)

  7. He, R., Kang, W.C., McAuley, J.J., et al.: Translation-based recommendation: a scalable method for modeling sequential behavior. In: IJCAI, pp. 5264–5268 (2018)

  8. He, X., Liao, L., Zhang, H., Nie, L., Hu, X., Chua, T.S.: Neural collaborative filtering. In: Proceedings of the 26th International Conference on World Wide Web, pp. 173–182 (2017)

  9. Herawan, T., Noraziah, A., Abdullah, Z., Deris, M.M., Abawajy, J.H.: IPMA: indirect patterns mining algorithm. In: Advanced Methods for Computational Collective Intelligence, vol. 285, pp. 159–166 (2013)

  10. Hidasi, B., Karatzoglou, A., Baltrunas, L., Tikk, D.: Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939 (2015)

  11. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)

  12. Kalisch, M., Bühlmann, P.: Estimating high-dimensional directed acyclic graphs with the PC-algorithm. J. Mach. Learn. Res. 8(3) (2007)

  13. Kang, W.C., McAuley, J.: Self-attentive sequential recommendation. In: 2018 IEEE International Conference on Data Mining (ICDM), pp. 197–206. IEEE (2018)

  14. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  15. Liu, Q., Liu, Z., Zhu, Z., Wu, S., Wang, L.: Deep stable multi-interest learning for out-of-distribution sequential recommendation. arXiv preprint arXiv:2304.05615 (2023)

  16. Ng, I., Zhu, S., Chen, Z., Fang, Z.: A graph autoencoder approach to causal structure learning. arXiv preprint arXiv:1911.07420 (2019)

  17. Tang, J., Wang, K.: Personalized top-N sequential recommendation via convolutional sequence embedding. In: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 565–573 (2018)

  18. Van Den Oord, A., Vinyals, O., et al.: Neural discrete representation learning. In: Advances in Neural Information Processing Systems, vol. 30 (2017)

  19. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)

  20. Wang, S., Cao, L.: Inferring implicit rules by learning explicit and hidden item dependency. IEEE Trans. Syst. Man Cybern. Syst. 50(3), 935–946 (2017)

  21. Wang, X., Zhang, R., Sun, Y., Qi, J.: Doubly robust joint learning for recommendation on data missing not at random. In: International Conference on Machine Learning, pp. 6638–6647. PMLR (2019)

  22. Wang, Z., Chen, X., Dong, Z., Dai, Q., Wen, J.R.: Sequential recommendation with causal behavior discovery. arXiv preprint arXiv:2204.00216 (2022)

  23. Xu, S., et al.: Causal structure learning with recommendation system. arXiv preprint arXiv:2210.10256 (2022)

  24. Zhang, K., Glymour, M.R.: Unmixing for causal inference: thoughts on McCaffrey and Danks. Br. J. Philos. Sci. (2020)

  25. Zhang, S., et al.: Personalized latent structure learning for recommendation. IEEE Trans. Pattern Anal. Mach. Intell. (2023)

  26. Zhang, S., et al.: Video-audio domain generalization via confounder disentanglement. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 15322–15330 (2023)

  27. Zhang, S., et al.: DeVLBert: learning deconfounded visio-linguistic representations. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 4373–4382 (2020)

  28. Zhang, S., Yao, D., Zhao, Z., Chua, T.S., Wu, F.: CauseRec: counterfactual user sequence synthesis for sequential recommendation. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 367–377 (2021)

  29. Zhang, S., Tay, Y., Yao, L., Sun, A.: Next item recommendation with self-attention. arXiv preprint arXiv:1808.06414 (2018)

  30. Zheng, X., Aragam, B., Ravikumar, P.K., Xing, E.P.: DAGs with NO TEARS: continuous optimization for structure learning. In: Advances in Neural Information Processing Systems, vol. 31 (2018)

  31. Zhou, G., et al.: Deep interest network for click-through rate prediction. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1059–1068 (2018)

  32. Zhou, K., Yu, H., Zhao, W.X., Wen, J.R.: Filter-enhanced MLP is all you need for sequential recommendation. In: Proceedings of the ACM Web Conference 2022, pp. 2388–2399 (2022)

  33. Zhu, Y., Huang, B., Jiang, S., Yang, M., Yang, Y., Zhong, W.: Progressive self-attention network with unsymmetrical positional encoding for sequential recommendation. In: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2029–2033 (2022)


Acknowledgement

This work was supported in part by National Natural Science Foundation of China (62006207, U20A20387), Young Elite Scientists Sponsorship Program by CAST (2021QNRC001), Zhejiang Province Natural Science Foundation (LQ21F020020), and the Fundamental Research Funds for the Central Universities (226-2022-00142, 226-2022-00051).

Author information

Correspondence to Shengyu Zhang or Kun Kuang.

A Appendix

A.1 Experimental Setup

Datasets. We conduct experiments based on the following benchmark datasets, of which the statistics are shown in Table 4:

  • Movielens-1M (footnote 4) contains anonymous ratings of movies made by MovieLens users who joined MovieLens in 2000, together with the metadata of each movie, such as the categories it belongs to.

  • Amazon-Music (footnote 5) is an e-commerce dataset collected from Amazon.com, which also includes the category information of each product.

  • Netflix (footnote 6) is another movie-rating dataset, constructed to support participants in the Netflix Prize. In this experiment, we use the preprocessed dataset provided in NATR [5].

To ensure dataset quality, we discard items with fewer than 20 appearances; for user filtering, we adopt 5-core settings for Amazon and Netflix and a 20-core setting for Movielens, owing to their different sparsity.

For the three datasets above, we sort the user-item interactions in chronological order and split them into training and testing sets with a ratio of 8:2. In our experiments, the validation set coincides with the test set. For each interaction, we regard feedback with a positive rating as positive.
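The preprocessing above can be sketched as follows; this is an illustrative single-pass version (the function name and the `(user, item, timestamp)` tuple layout are assumptions, not the authors' code, and a strict k-core filter would iterate to a fixed point).

```python
from collections import Counter

def filter_and_split(interactions, item_min=20, user_min=5, train_ratio=0.8):
    """Drop rare items, apply a user k-core filter, then split the
    interactions chronologically into training and testing sets.

    `interactions` is a list of (user, item, timestamp) tuples."""
    # Discard items with fewer than `item_min` appearances.
    item_counts = Counter(i for _, i, _ in interactions)
    kept = [(u, i, t) for u, i, t in interactions if item_counts[i] >= item_min]

    # User filter: keep users with at least `user_min` remaining interactions.
    user_counts = Counter(u for u, _, _ in kept)
    kept = [(u, i, t) for u, i, t in kept if user_counts[u] >= user_min]

    # Chronological 8:2 split: earliest 80% for training, the rest for testing.
    kept.sort(key=lambda x: x[2])
    cut = int(len(kept) * train_ratio)
    return kept[:cut], kept[cut:]
```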

OOD Datasets. Following the data-splitting paradigm of DESMIL [15], we build the OOD datasets by adjusting the distribution of user groups across the training and test sets. We randomly select a user and compute the Jaccard similarity (footnote 7) between every other user and the selected user. We then take the 80% of users with the smallest similarity as the training set and use the whole dataset for testing, so that the training and testing sets contain different user groups.
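A minimal sketch of this OOD construction, assuming each user is represented by the set of items they interacted with (the function names and data layout are illustrative, not the paper's code):

```python
def jaccard(a, b):
    """Jaccard similarity between two item sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def ood_split(user_items, anchor, train_frac=0.8):
    """Rank all users by Jaccard similarity to a chosen anchor user,
    keep the `train_frac` least similar as training users, and use
    the whole population for testing."""
    others = [u for u in user_items if u != anchor]
    others.sort(key=lambda u: jaccard(user_items[u], user_items[anchor]))
    n_train = int(len(others) * train_frac)
    train_users = set(others[:n_train])   # least similar to the anchor
    test_users = set(user_items)          # whole dataset for testing
    return train_users, test_users
```

Because the training users are the ones least similar to the anchor while testing covers everyone, the two splits deliberately contain different user groups, producing the desired distribution shift.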

Table 4. Statistics of the datasets.

Baselines. For the implementation of our method, we choose YoutubeDNN [3], a classic deep-learning recommender, as the base model. Since ours is a sequential recommendation model, we exclude graph-based recommenders and focus on sequential ones. Collaborative filtering (CF) models are adapted so that they take the historical sequence as input. The compared models are as follows:

  • YoutubeDNN [3]. One of the industry's most commonly used recommender systems, it represents each user with their historical interaction data and combines it to obtain the final predictive score.

  • GRU4Rec [10]. GRU4Rec is a representative session-based recommendation model that adopts recurrent neural networks to learn the relationship between target items and item sequences.

  • NeuMF [8]. NeuMF combines traditional matrix factorization with a multi-layer perceptron to extract both low- and high-order features simultaneously.

  • SASRec [13]. SASRec applies the self-attention mechanism to sequential recommendation to help learn more valuable information.

  • Caser [17]. Caser is a state-of-the-art model that utilizes vertical and horizontal convolutional filters to extract short-term preferences from user interaction sequences.

  • AttRec [29]. Similar to SASRec, AttRec uses self-attention mechanisms to capture long-term preference. Moreover, it uses metric learning to model the interplay between short-term interests and long-term preferences.

  • PAUP [33]. PAUP proposes a down-sampling convolution module and an unsymmetrical positional encoding strategy to effectively and efficiently capture both short- and long-term patterns.

  • FMLP-Rec [32]. FMLP-Rec uses filtering algorithms from signal processing instead of the traditional Transformer to attenuate noise in the original dataset.

Evaluation Metrics. We use the same three evaluation metrics as the ComiRec framework [1]. In a recommender system, accuracy, that is, whether what the user wants to see is correctly recommended, is crucial, so we use Recall and Hit Rate to evaluate it. The third metric is NDCG (Normalized Discounted Cumulative Gain), which also accounts for the positions of the desired items in the recommendation list and is therefore more informative than Recall. For all three metrics, higher scores indicate a better recommendation model.
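For reference, the three metrics can be computed for a single user as below; this is a generic textbook formulation, assumed to match the ComiRec evaluation protocol but not taken from its code.

```python
import math

def metrics_at_k(recommended, relevant, k):
    """Recall@K, HitRate@K, and NDCG@K for one user.
    `recommended` is the model's ranked item list,
    `relevant` the set of ground-truth items."""
    topk = recommended[:k]
    hits = [item in relevant for item in topk]

    recall = sum(hits) / max(len(relevant), 1)   # fraction of relevant items retrieved
    hit_rate = 1.0 if any(hits) else 0.0         # at least one hit in the top K

    # DCG discounts hits by their rank; IDCG is the DCG of an ideal ranking.
    dcg = sum(1.0 / math.log2(rank + 2) for rank, h in enumerate(hits) if h)
    idcg = sum(1.0 / math.log2(r + 2) for r in range(min(len(relevant), k)))
    return recall, hit_rate, dcg / idcg if idcg else 0.0
```

In practice each metric is averaged over all test users.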

Implementation Details. For all the models mentioned above, we use Adam [14] as the optimizer with a learning rate of 0.001, and implement them with an item embedding of size \(d=64\). The numbers of horizontal and vertical convolution filters in Caser are 4 and 16, respectively. For each sequence in Caser, AttRec, and SASRec, we randomly sample 10 items the user has not interacted with as negative samples, while for GRU4Rec we follow the original paper and use the other items in the same batch, except for the target item, as negatives. For the NeuMF baseline, we create an embedding for each user from their interaction history over time to adapt it to sequential recommendation.
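The uniform negative sampling used for Caser, AttRec, and SASRec can be sketched as follows (an illustrative helper, assuming item ids 0..n_items-1; not the authors' implementation):

```python
import random

def sample_negatives(user_history, n_items, k=10, rng=random):
    """Draw `k` distinct items the user has not interacted with,
    uniformly from a catalogue of `n_items` items."""
    seen = set(user_history)
    negatives = set()
    while len(negatives) < k:
        cand = rng.randrange(n_items)
        if cand not in seen:          # reject items the user has seen
            negatives.add(cand)
    return list(negatives)
```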


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Fu, K., Miao, Q., Zhang, S., Kuang, K., Wu, F. (2024). End-to-End Optimization of Quantization-Based Structure Learning and Interventional Next-Item Recommendation. In: Fang, L., Pei, J., Zhai, G., Wang, R. (eds) Artificial Intelligence. CICAI 2023. Lecture Notes in Computer Science, vol 14473. Springer, Singapore. https://doi.org/10.1007/978-981-99-8850-1_34


  • DOI: https://doi.org/10.1007/978-981-99-8850-1_34


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8849-5

  • Online ISBN: 978-981-99-8850-1

  • eBook Packages: Computer Science (R0)
