Abstract
With the development of deep learning, an increasing number of deep-learning techniques have been applied to recommender systems, making them more effective and reliable. However, because real-time data follow varying distributions, these deep-learning-based methods learn only correlations in the data rather than actual causal effects, degrading recommender performance when a distribution shift occurs. Causal structure learning, which searches for causal relationships between variables, has therefore been applied to recommender systems. However, existing methods assume that the recommender system is a non-interventional environment, so the learned causal graph is not entirely correct. In this paper, we propose to decouple the recommender module and the causal module so that the intervention of the recommender system is taken into account when building a causal graph. We utilize vector quantization to learn a cluster-level graph rather than an item-level graph to guarantee an acceptable training time. With an adjustable number of clusters, our model can adapt to datasets of any size and be trained within a bounded period. We conduct extensive experiments on both real-world and synthetic OOD datasets to demonstrate that our model is more effective than other state-of-the-art sequential recommenders.
References
Cen, Y., Zhang, J., Zou, X., Zhou, C., Yang, H., Tang, J.: Controllable multi-interest framework for recommendation. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2942–2951 (2020)
Chen, W., Wu, Y., Cai, R., Chen, Y., Hao, Z.: CCSL: a causal structure learning method from multiple unknown environments. arXiv preprint arXiv:2111.09666 (2021)
Covington, P., Adams, J., Sargin, E.: Deep neural networks for YouTube recommendations. In: Proceedings of the 10th ACM Conference on Recommender Systems, pp. 191–198 (2016)
Dong, X., Yang, Y.: Searching for a robust neural architecture in four GPU hours. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1761–1770 (2019)
Gao, C., et al.: Cross-domain recommendation with bridge-item embeddings. ACM Trans. Knowl. Discov. Data (TKDD) 16(1), 1–23 (2021)
Gao, C., Zheng, Y., Wang, W., Feng, F., He, X., Li, Y.: Causal inference in recommender systems: a survey and future directions. arXiv preprint arXiv:2208.12397 (2022)
He, R., Kang, W.C., McAuley, J.J., et al.: Translation-based recommendation: a scalable method for modeling sequential behavior. In: IJCAI, pp. 5264–5268 (2018)
He, X., Liao, L., Zhang, H., Nie, L., Hu, X., Chua, T.S.: Neural collaborative filtering. In: Proceedings of the 26th International Conference on World Wide Web, pp. 173–182 (2017)
Herawan, T., Noraziah, A., Abdullah, Z., Deris, M.M., Abawajy, J.H.: IPMA: indirect patterns mining algorithm. In: Advanced Methods for Computational Collective Intelligence, vol. 285, pp. 159–166 (2013)
Hidasi, B., Karatzoglou, A., Baltrunas, L., Tikk, D.: Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939 (2015)
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)
Kalisch, M., Bühlman, P.: Estimating high-dimensional directed acyclic graphs with the PC-algorithm. J. Mach. Learn. Res. 8(3) (2007)
Kang, W.C., McAuley, J.: Self-attentive sequential recommendation. In: 2018 IEEE International Conference on Data Mining (ICDM), pp. 197–206. IEEE (2018)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, Q., Liu, Z., Zhu, Z., Wu, S., Wang, L.: Deep stable multi-interest learning for out-of-distribution sequential recommendation. arXiv preprint arXiv:2304.05615 (2023)
Ng, I., Zhu, S., Chen, Z., Fang, Z.: A graph autoencoder approach to causal structure learning. arXiv preprint arXiv:1911.07420 (2019)
Tang, J., Wang, K.: Personalized top-n sequential recommendation via convolutional sequence embedding. In: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 565–573 (2018)
Van Den Oord, A., Vinyals, O., et al.: Neural discrete representation learning. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Wang, S., Cao, L.: Inferring implicit rules by learning explicit and hidden item dependency. IEEE Trans. Syst. Man Cybern. Syst. 50(3), 935–946 (2017)
Wang, X., Zhang, R., Sun, Y., Qi, J.: Doubly robust joint learning for recommendation on data missing not at random. In: International Conference on Machine Learning, pp. 6638–6647. PMLR (2019)
Wang, Z., Chen, X., Dong, Z., Dai, Q., Wen, J.R.: Sequential recommendation with causal behavior discovery. arXiv preprint arXiv:2204.00216 (2022)
Xu, S., et al.: Causal structure learning with recommendation system. arXiv preprint arXiv:2210.10256 (2022)
Zhang, K., Glymour, M.R.: Unmixing for causal inference: thoughts on McCaffrey and Danks. Br. J. Philos. Sci. (2020)
Zhang, S., et al.: Personalized latent structure learning for recommendation. IEEE Trans. Pattern Anal. Mach. Intell. (2023)
Zhang, S., et al.: Video-audio domain generalization via confounder disentanglement. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 15322–15330 (2023)
Zhang, S., et al.: DeVLBert: learning deconfounded visio-linguistic representations. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 4373–4382 (2020)
Zhang, S., Yao, D., Zhao, Z., Chua, T.S., Wu, F.: CauseRec: counterfactual user sequence synthesis for sequential recommendation. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 367–377 (2021)
Zhang, S., Tay, Y., Yao, L., Sun, A.: Next item recommendation with self-attention. arXiv preprint arXiv:1808.06414 (2018)
Zheng, X., Aragam, B., Ravikumar, P.K., Xing, E.P.: DAGs with no tears: continuous optimization for structure learning. In: Advances in Neural Information Processing Systems, vol. 31 (2018)
Zhou, G., et al.: Deep interest network for click-through rate prediction. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1059–1068 (2018)
Zhou, K., Yu, H., Zhao, W.X., Wen, J.R.: Filter-enhanced MLP is all you need for sequential recommendation. In: Proceedings of the ACM Web Conference 2022, pp. 2388–2399 (2022)
Zhu, Y., Huang, B., Jiang, S., Yang, M., Yang, Y., Zhong, W.: Progressive self-attention network with unsymmetrical positional encoding for sequential recommendation. In: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2029–2033 (2022)
Acknowledgement
This work was supported in part by National Natural Science Foundation of China (62006207, U20A20387), Young Elite Scientists Sponsorship Program by CAST (2021QNRC001), Zhejiang Province Natural Science Foundation (LQ21F020020), and the Fundamental Research Funds for the Central Universities (226-2022-00142, 226-2022-00051).
A Appendix
A.1 Experimental Setup
Datasets. We conduct experiments on the following benchmark datasets, whose statistics are shown in Table 4:

- Movielens-1M contains anonymous ratings of movies made by MovieLens users who joined MovieLens in 2000. It also includes the metadata of each movie, such as the categories it belongs to.

- Amazon-Music is an e-commerce dataset collected from Amazon.com, which also includes the category information of each product.

- Netflix is another movie rating dataset, constructed to support participants in the Netflix Prize. In this experiment, we use the preprocessed dataset provided in NATR [5].
To maintain dataset quality, we discard items with fewer than 20 related appearances; for user filtering, we adopt 5-core settings for Amazon and Netflix and 20-core settings for Movielens, owing to their different sparsity.

For the three datasets above, we sort the user-item interactions in chronological order and divide them into training and testing sets at a ratio of 8:2. In our experiments, the validation set is identical to the test set. For each interaction, we regard feedback with positive ratings as positive samples.
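The preprocessing described above can be sketched as follows. This is a minimal illustration under our own assumptions: the function name and field layout are ours, the positive-rating threshold is set to 4 as an example, and the 8:2 split is applied per user (the text does not specify whether the split is global or per user).

```python
# Hypothetical sketch of the preprocessing: keep positively rated feedback,
# sort each user's interactions chronologically, then split 8:2.

def chronological_split(interactions, train_ratio=0.8, positive_threshold=4):
    """interactions: iterable of (user, item, rating, timestamp) tuples."""
    by_user = {}
    for user, item, rating, ts in interactions:
        if rating >= positive_threshold:  # keep positive feedback only
            by_user.setdefault(user, []).append((ts, item))

    train, test = {}, {}
    for user, events in by_user.items():
        events.sort()  # chronological order by timestamp
        items = [item for _, item in events]
        cut = int(len(items) * train_ratio)  # 8:2 train/test boundary
        train[user] = items[:cut]
        test[user] = items[cut:]
    return train, test
```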
OOD Datasets. Following the data-splitting paradigm of DESMIL [15], we build the OOD datasets by adjusting the distribution of user groups in the training and test sets. We randomly select a user and compute the Jaccard similarity between every other user and the selected user. We then choose the 80% of users with the smallest similarity as the training set and use the whole dataset for testing, so that the training and testing sets contain different user groups.
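The OOD split above can be sketched as follows; the function names and data layout are our own, and we assume Jaccard similarity is computed over the sets of items each user interacted with.

```python
# Sketch of the OOD split: pick a reference user at random, rank every other
# user by Jaccard similarity of interacted item sets, and put the 80%
# least-similar users into the training set; the whole dataset is the test set.
import random

def jaccard(a, b):
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def ood_split(user_items, train_ratio=0.8, seed=0):
    """user_items: dict mapping user id -> set of interacted items."""
    rng = random.Random(seed)
    ref = rng.choice(sorted(user_items))  # randomly selected reference user
    others = [u for u in user_items if u != ref]
    # least-similar users first
    others.sort(key=lambda u: jaccard(user_items[u], user_items[ref]))
    cut = int(len(others) * train_ratio)
    train_users = set(others[:cut])
    test_users = set(user_items)  # testing uses the whole dataset
    return train_users, test_users
```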
Baselines. For the implementation of our method, we choose YoutubeDNN [3], a classic deep-learning recommender system, as the base model. Since our model is a sequential recommendation model, we omit graph-based recommenders and focus on sequential ones. For collaborative filtering (CF) recommendation models, we adapt them so that the historical sequence can be fed into the model. The compared models are as follows:
- YoutubeDNN [3]. As one of the industry's most commonly used recommender systems, it represents each user with their historical interaction data and combines these representations to obtain the final predictive score.

- GRU4Rec [10]. GRU4Rec is a representative session-based recommendation model that adopts recurrent neural networks to learn the relationship between target items and item sequences.

- NeuMF [8]. NeuMF combines traditional matrix factorization and a multi-layer perceptron to extract both low- and high-dimensional features simultaneously.

- SASRec [13]. SASRec applies self-attention mechanisms to sequential recommendation to help learn more valuable information.

- Caser [17]. Caser is a state-of-the-art model that utilizes vertical and horizontal convolutional networks to extract short-term preferences from user interaction sequences.

- AttRec [29]. Similar to SASRec, AttRec uses self-attention mechanisms to capture long-term preferences. Moreover, it uses metric learning to model the interplay between short-term interests and long-term preferences.

- PAUP [33]. PAUP proposes a down-sampling convolution module and an unsymmetrical positional encoding strategy to effectively and efficiently capture both short- and long-term patterns.

- FMLP-Rec [32]. FMLP-Rec replaces the traditional transformer with filtering algorithms from signal processing to mitigate noise in the original dataset.
Evaluation Metrics. The three evaluation metrics we use are the same as in the ComiRec framework [1]. In a recommender system, accuracy, that is, whether what the user wants to see is correctly recommended, is crucial; we therefore use Recall and Hit Rate to evaluate a model's accuracy. The third metric is NDCG (Normalized Discounted Cumulative Gain), which pays more attention to the positions at which the items users desire appear in the recommendation list and is more informative than recall alone. The higher the scores on these three metrics, the better the recommendation model performs.
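For reference, the three metrics for a single user can be sketched as below; `recommended` is the ranked top-N list and `relevant` the set of held-out ground-truth items (names and the exact NDCG normalization are our assumptions; implementations vary in how they truncate the ideal DCG).

```python
# Standard per-user definitions of Recall@N, Hit Rate@N, and NDCG@N.
import math

def recall_at_n(recommended, relevant):
    # fraction of relevant items that appear in the top-N list
    hits = sum(1 for item in recommended if item in relevant)
    return hits / len(relevant) if relevant else 0.0

def hit_rate_at_n(recommended, relevant):
    # 1 if at least one relevant item was recommended, else 0
    return 1.0 if any(item in relevant for item in recommended) else 0.0

def ndcg_at_n(recommended, relevant):
    # discounted gain rewards relevant items ranked near the top
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(recommended) if item in relevant)
    ideal = sum(1.0 / math.log2(rank + 2)
                for rank in range(min(len(relevant), len(recommended))))
    return dcg / ideal if ideal else 0.0
```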
Implementation Details. For all the models mentioned above, we use Adam [14] as the optimizer with a learning rate of 0.001. All models are implemented with an item embedding of size \(d=64\). The numbers of horizontal and vertical convolution filters in Caser are 4 and 16, respectively. For each sequence in Caser, AttRec, and SASRec, we randomly sample 10 items not interacted with by the user as negative samples. For GRU4Rec, we follow the method used in the original paper, using the other items in the same batch, except for the target item, as negative samples. For the NeuMF baseline, we create an embedding for each user from their interaction history over time to adapt the model to sequential recommendation.
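The per-sequence negative sampling used for Caser, AttRec, and SASRec can be sketched as below; the function name is ours, and we assume the 10 negatives are drawn uniformly at random from the items the user never interacted with.

```python
# Hypothetical sketch of per-sequence negative sampling: draw items the
# user has not interacted with, uniformly at random without replacement.
import random

def sample_negatives(all_items, interacted, num_negatives=10, seed=0):
    rng = random.Random(seed)
    candidates = list(set(all_items) - set(interacted))
    return rng.sample(candidates, min(num_negatives, len(candidates)))
```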
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Fu, K., Miao, Q., Zhang, S., Kuang, K., Wu, F. (2024). End-to-End Optimization of Quantization-Based Structure Learning and Interventional Next-Item Recommendation. In: Fang, L., Pei, J., Zhai, G., Wang, R. (eds) Artificial Intelligence. CICAI 2023. Lecture Notes in Computer Science(), vol 14473. Springer, Singapore. https://doi.org/10.1007/978-981-99-8850-1_34
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8849-5
Online ISBN: 978-981-99-8850-1
eBook Packages: Computer Science (R0)