Abstract
With the development of deep learning, an increasing number of deep-learning techniques have been applied to recommender systems, making them more effective and reliable. However, because real-time data follow varying distributions, these deep-learning-based methods learn only correlations in the data rather than actual causal effects, degrading recommender performance when a distribution shift occurs. Causal structure learning, which searches for causal relationships between variables, has therefore been applied to recommender systems. However, existing methods assume that the recommender system is a non-interventional environment, so the learned causal graph is not entirely correct. In this paper, we propose to decouple the recommender module and the causal module so that the intervention of the recommender system is taken into account when building a causal graph. We utilize vector quantization to learn a cluster-level graph rather than an item-level graph to guarantee an acceptable training time. With an adjustable number of clusters, our model can adapt to datasets of any size and be trained within a bounded period. We conduct extensive experiments on both real-world and synthetic OOD datasets to demonstrate that our model is more effective than other state-of-the-art sequential recommenders.
References
Cen, Y., Zhang, J., Zou, X., Zhou, C., Yang, H., Tang, J.: Controllable multi-interest framework for recommendation. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2942–2951 (2020)
Chen, W., Wu, Y., Cai, R., Chen, Y., Hao, Z.: CCSL: a causal structure learning method from multiple unknown environments. arXiv preprint arXiv:2111.09666 (2021)
Covington, P., Adams, J., Sargin, E.: Deep neural networks for YouTube recommendations. In: Proceedings of the 10th ACM Conference on Recommender Systems, pp. 191–198 (2016)
Dong, X., Yang, Y.: Searching for a robust neural architecture in four GPU hours. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1761–1770 (2019)
Gao, C., et al.: Cross-domain recommendation with bridge-item embeddings. ACM Trans. Knowl. Discov. Data (TKDD) 16(1), 1–23 (2021)
Gao, C., Zheng, Y., Wang, W., Feng, F., He, X., Li, Y.: Causal inference in recommender systems: a survey and future directions. arXiv preprint arXiv:2208.12397 (2022)
He, R., Kang, W.C., McAuley, J.J., et al.: Translation-based recommendation: a scalable method for modeling sequential behavior. In: IJCAI, pp. 5264–5268 (2018)
He, X., Liao, L., Zhang, H., Nie, L., Hu, X., Chua, T.S.: Neural collaborative filtering. In: Proceedings of the 26th International Conference on World Wide Web, pp. 173–182 (2017)
Herawan, T., Noraziah, A., Abdullah, Z., Deris, M.M., Abawajy, J.H.: IPMA: indirect patterns mining algorithm. In: Advanced Methods for Computational Collective Intelligence, vol. 285, pp. 159–166 (2013)
Hidasi, B., Karatzoglou, A., Baltrunas, L., Tikk, D.: Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939 (2015)
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)
Kalisch, M., Bühlman, P.: Estimating high-dimensional directed acyclic graphs with the PC-algorithm. J. Mach. Learn. Res. 8(3) (2007)
Kang, W.C., McAuley, J.: Self-attentive sequential recommendation. In: 2018 IEEE International Conference on Data Mining (ICDM), pp. 197–206. IEEE (2018)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Liu, Q., Liu, Z., Zhu, Z., Wu, S., Wang, L.: Deep stable multi-interest learning for out-of-distribution sequential recommendation. arXiv preprint arXiv:2304.05615 (2023)
Ng, I., Zhu, S., Chen, Z., Fang, Z.: A graph autoencoder approach to causal structure learning. arXiv preprint arXiv:1911.07420 (2019)
Tang, J., Wang, K.: Personalized top-n sequential recommendation via convolutional sequence embedding. In: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 565–573 (2018)
Van Den Oord, A., Vinyals, O., et al.: Neural discrete representation learning. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Wang, S., Cao, L.: Inferring implicit rules by learning explicit and hidden item dependency. IEEE Trans. Syst. Man Cybern. Syst. 50(3), 935–946 (2017)
Wang, X., Zhang, R., Sun, Y., Qi, J.: Doubly robust joint learning for recommendation on data missing not at random. In: International Conference on Machine Learning, pp. 6638–6647. PMLR (2019)
Wang, Z., Chen, X., Dong, Z., Dai, Q., Wen, J.R.: Sequential recommendation with causal behavior discovery. arXiv preprint arXiv:2204.00216 (2022)
Xu, S., et al.: Causal structure learning with recommendation system. arXiv preprint arXiv:2210.10256 (2022)
Zhang, K., Glymour, M.R.: Unmixing for causal inference: thoughts on McCaffrey and Danks. Br. J. Philos. Sci. (2020)
Zhang, S., et al.: Personalized latent structure learning for recommendation. IEEE Trans. Pattern Anal. Mach. Intell. (2023)
Zhang, S., et al.: Video-audio domain generalization via confounder disentanglement. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 15322–15330 (2023)
Zhang, S., et al.: DeVLBert: learning deconfounded visio-linguistic representations. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 4373–4382 (2020)
Zhang, S., Yao, D., Zhao, Z., Chua, T.S., Wu, F.: CauseRec: counterfactual user sequence synthesis for sequential recommendation. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 367–377 (2021)
Zhang, S., Tay, Y., Yao, L., Sun, A.: Next item recommendation with self-attention. arXiv preprint arXiv:1808.06414 (2018)
Zheng, X., Aragam, B., Ravikumar, P.K., Xing, E.P.: DAGs with no tears: continuous optimization for structure learning. In: Advances in Neural Information Processing Systems, vol. 31 (2018)
Zhou, G., et al.: Deep interest network for click-through rate prediction. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1059–1068 (2018)
Zhou, K., Yu, H., Zhao, W.X., Wen, J.R.: Filter-enhanced MLP is all you need for sequential recommendation. In: Proceedings of the ACM Web Conference 2022, pp. 2388–2399 (2022)
Zhu, Y., Huang, B., Jiang, S., Yang, M., Yang, Y., Zhong, W.: Progressive self-attention network with unsymmetrical positional encoding for sequential recommendation. In: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2029–2033 (2022)
Acknowledgement
This work was supported in part by National Natural Science Foundation of China (62006207, U20A20387), Young Elite Scientists Sponsorship Program by CAST (2021QNRC001), Zhejiang Province Natural Science Foundation (LQ21F020020), and the Fundamental Research Funds for the Central Universities (226-2022-00142, 226-2022-00051).
A Appendix
A.1 Experimental Setup
Datasets. We conduct experiments on the following benchmark datasets, whose statistics are shown in Table 4:

- Movielens-1M contains anonymous ratings of movies made by MovieLens users who joined MovieLens in 2000. It also includes the metadata of each movie, such as the categories it belongs to.

- Amazon-Music is an e-commerce dataset collected from Amazon.com, which also includes the category information of each product.

- Netflix is another movie rating dataset, constructed to support participants in the Netflix Prize. In this experiment, we use the preprocessed dataset provided in NATR [5].
To maintain dataset quality, we discard items with fewer than 20 related appearances; for user filtering, we adopt 5-core settings for Amazon and Netflix and 20-core settings for Movielens, owing to their different sparsity.

For the three datasets above, we sort the user-item interactions in chronological order and divide them into training and testing sets at a ratio of 8:2. In our experiments, the validation set is identical to the test set. For each interaction, we regard feedback with positive ratings as positive samples.
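The preprocessing described above can be sketched as follows. This is a minimal illustration under our own assumptions: the function name and field layout are ours, the positive-rating threshold is set to 4 as an example, and the 8:2 split is applied per user (the text does not specify whether the split is global or per user).

```python
# Hypothetical sketch of the preprocessing: keep positively rated feedback,
# sort each user's interactions chronologically, then split 8:2.

def chronological_split(interactions, train_ratio=0.8, positive_threshold=4):
    """interactions: iterable of (user, item, rating, timestamp) tuples."""
    by_user = {}
    for user, item, rating, ts in interactions:
        if rating >= positive_threshold:  # keep positive feedback only
            by_user.setdefault(user, []).append((ts, item))

    train, test = {}, {}
    for user, events in by_user.items():
        events.sort()  # chronological order by timestamp
        items = [item for _, item in events]
        cut = int(len(items) * train_ratio)  # 8:2 train/test boundary
        train[user] = items[:cut]
        test[user] = items[cut:]
    return train, test
```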
OOD Datasets. Following the data-splitting paradigm of DESMIL [15], we build the OOD datasets by adjusting the distribution of user groups in the training and test sets. We randomly select a user and compute the Jaccard similarity between every other user and the selected user. We then choose the 80% of users with the smallest similarity as the training set and use the whole dataset for testing, so that the training and testing sets contain different user groups.
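The OOD split above can be sketched as follows; the function names and data layout are our own, and we assume Jaccard similarity is computed over the sets of items each user interacted with.

```python
# Sketch of the OOD split: pick a reference user at random, rank every other
# user by Jaccard similarity of interacted item sets, and put the 80%
# least-similar users into the training set; the whole dataset is the test set.
import random

def jaccard(a, b):
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def ood_split(user_items, train_ratio=0.8, seed=0):
    """user_items: dict mapping user id -> set of interacted items."""
    rng = random.Random(seed)
    ref = rng.choice(sorted(user_items))  # randomly selected reference user
    others = [u for u in user_items if u != ref]
    # least-similar users first
    others.sort(key=lambda u: jaccard(user_items[u], user_items[ref]))
    cut = int(len(others) * train_ratio)
    train_users = set(others[:cut])
    test_users = set(user_items)  # testing uses the whole dataset
    return train_users, test_users
```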
Baselines. For the implementation of our method, we choose YoutubeDNN [3], a classic deep-learning recommender system, as the base model. Since our model is a sequential recommendation model, we omit graph-based recommenders and focus on sequential ones. For collaborative filtering (CF) recommendation models, we adapt them so that the historical sequence can be fed into the model. The compared models are as follows:
- YoutubeDNN [3]. As one of the industry's most commonly used recommender systems, it represents each user with their historical interaction data and combines these representations to obtain the final predictive score.

- GRU4Rec [10]. GRU4Rec is a representative session-based recommendation model that adopts recurrent neural networks to learn the relationship between target items and item sequences.

- NeuMF [8]. NeuMF combines traditional matrix factorization and a multi-layer perceptron to extract both low- and high-dimensional features simultaneously.

- SASRec [13]. SASRec applies self-attention mechanisms to sequential recommendation to help learn more valuable information.

- Caser [17]. Caser is a state-of-the-art model that utilizes vertical and horizontal convolutional networks to extract short-term preferences from user interaction sequences.

- AttRec [29]. Similar to SASRec, AttRec uses self-attention mechanisms to capture long-term preferences. Moreover, it uses metric learning to model the interplay between short-term interests and long-term preferences.

- PAUP [33]. PAUP proposes a down-sampling convolution module and an unsymmetrical positional encoding strategy to effectively and efficiently capture both short- and long-term patterns.

- FMLP-Rec [32]. FMLP-Rec replaces the traditional transformer with filtering algorithms from signal processing to mitigate noise in the original dataset.
Evaluation Metrics. The three evaluation metrics we use are the same as in the ComiRec framework [1]. In a recommender system, accuracy, that is, whether what the user wants to see is correctly recommended, is crucial; we therefore use Recall and Hit Rate to evaluate a model's accuracy. The third metric is NDCG (Normalized Discounted Cumulative Gain), which pays more attention to the positions at which the items users desire appear in the recommendation list and is more informative than recall alone. The higher the scores on these three metrics, the better the recommendation model performs.
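For reference, the three metrics for a single user can be sketched as below; `recommended` is the ranked top-N list and `relevant` the set of held-out ground-truth items (names and the exact NDCG normalization are our assumptions; implementations vary in how they truncate the ideal DCG).

```python
# Standard per-user definitions of Recall@N, Hit Rate@N, and NDCG@N.
import math

def recall_at_n(recommended, relevant):
    # fraction of relevant items that appear in the top-N list
    hits = sum(1 for item in recommended if item in relevant)
    return hits / len(relevant) if relevant else 0.0

def hit_rate_at_n(recommended, relevant):
    # 1 if at least one relevant item was recommended, else 0
    return 1.0 if any(item in relevant for item in recommended) else 0.0

def ndcg_at_n(recommended, relevant):
    # discounted gain rewards relevant items ranked near the top
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(recommended) if item in relevant)
    ideal = sum(1.0 / math.log2(rank + 2)
                for rank in range(min(len(relevant), len(recommended))))
    return dcg / ideal if ideal else 0.0
```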
Implementation Details. For all the models mentioned above, we use Adam [14] as the optimizer with a learning rate of 0.001. All models are implemented with an item embedding of size \(d=64\). The numbers of horizontal and vertical convolution filters in Caser are 4 and 16, respectively. For each sequence in Caser, AttRec, and SASRec, we randomly sample 10 items not interacted with by the user as negative samples. For GRU4Rec, we follow the method used in the original paper, using the other items in the same batch, except for the target item, as negative samples. For the NeuMF baseline, we create an embedding for each user from their interaction history over time to adapt the model to sequential recommendation.
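The per-sequence negative sampling used for Caser, AttRec, and SASRec can be sketched as below; the function name is ours, and we assume the 10 negatives are drawn uniformly at random from the items the user never interacted with.

```python
# Hypothetical sketch of per-sequence negative sampling: draw items the
# user has not interacted with, uniformly at random without replacement.
import random

def sample_negatives(all_items, interacted, num_negatives=10, seed=0):
    rng = random.Random(seed)
    candidates = list(set(all_items) - set(interacted))
    return rng.sample(candidates, min(num_negatives, len(candidates)))
```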
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Fu, K., Miao, Q., Zhang, S., Kuang, K., Wu, F. (2024). End-to-End Optimization of Quantization-Based Structure Learning and Interventional Next-Item Recommendation. In: Fang, L., Pei, J., Zhai, G., Wang, R. (eds) Artificial Intelligence. CICAI 2023. Lecture Notes in Computer Science(), vol 14473. Springer, Singapore. https://doi.org/10.1007/978-981-99-8850-1_34
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8849-5
Online ISBN: 978-981-99-8850-1
eBook Packages: Computer Science (R0)