Dynamics-Aware Adaptation for Reinforcement Learning Based Cross-Domain Interactive Recommendation

Published: 07 July 2022

Abstract

Interactive recommender systems (IRS) have received wide attention in recent years. To capture users' dynamic preferences and maximize their long-term engagement, IRS are usually formulated as reinforcement learning (RL) problems. Despite their promise for solving complex decision-making problems, RL-based methods generally require large amounts of online interaction, which restricts their application due to economic considerations. One direction for alleviating this issue is cross-domain recommendation, which aims to leverage abundant logged interaction data from a source domain (e.g., the adventure genre in movie recommendation) to improve recommendation quality in a target domain (e.g., the crime genre). Nevertheless, prior studies mostly focus on adapting static representations of users and items; few have explored how temporally dynamic user-item interaction patterns transform across domains.
Motivated by the above considerations, we propose DACIR, a novel Doubly-Adaptive deep RL-based framework for Cross-domain Interactive Recommendation. We first pinpoint how users behave differently in two domains and highlight the potential to leverage the shared user dynamics to boost IRS. To transfer static user preferences across domains, DACIR enforces consistent item representations by aligning embeddings into a shared latent space. In addition, given the user dynamics in IRS, DACIR calibrates the dynamic interaction patterns of the two domains via reward correlation. Once this double adaptation narrows the cross-domain gap, a transferable policy for the target recommender can be learned from logged data. Experiments on real-world datasets validate the superiority of our approach, which consistently achieves significant improvements over the baselines.
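To give a concrete flavor of the embedding-alignment idea sketched in the abstract, the snippet below fits a linear map that projects source-domain item embeddings into the target domain's latent space by least squares over paired items. This is only an illustrative sketch: the dimensions, variable names, and synthetic data are all hypothetical, and it does not reproduce DACIR's actual alignment objective, network architecture, or the reward-correlation calibration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (not the paper's): paired item embeddings from two domains.
n_pairs, d_src, d_tgt = 100, 8, 6
src_emb = rng.normal(size=(n_pairs, d_src))   # source-domain item embeddings

# Synthetic target embeddings: a noisy linear transform of the source,
# so a linear alignment is recoverable in this toy example.
true_map = rng.normal(size=(d_src, d_tgt))
tgt_emb = src_emb @ true_map + 0.01 * rng.normal(size=(n_pairs, d_tgt))

# Least-squares alignment: one simple way to "enforce consistency of item
# representations" by mapping source items into the target latent space.
W, *_ = np.linalg.lstsq(src_emb, tgt_emb, rcond=None)
aligned = src_emb @ W

mse = float(np.mean((aligned - tgt_emb) ** 2))
print(f"alignment MSE: {mse:.5f}")
```

In practice, cross-domain recommenders typically learn such alignments jointly with the recommendation objective (often with nonlinear encoders or adversarial losses) rather than in closed form, but the least-squares version makes the "shared latent space" intuition easy to see.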


Published In

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2022
3569 pages
ISBN:9781450387323
DOI:10.1145/3477495

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cross-domain recommendation
  2. interactive recommender systems
  3. reinforcement learning

Qualifiers

  • Research-article

Conference

SIGIR '22

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Article Metrics

  • Downloads (Last 12 months)126
  • Downloads (Last 6 weeks)16
Reflects downloads up to 28 Feb 2025


Cited By

  • (2024) CoRAL: Collaborative Retrieval-Augmented Large Language Models Improve Long-tail Recommendation. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 3391-3401. https://doi.org/10.1145/3637528.3671901
  • (2024) Mutual Information-based Preference Disentangling and Transferring for Non-overlapped Multi-target Cross-domain Recommendations. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2124-2133. https://doi.org/10.1145/3626772.3657780
  • (2024) Large Language Models are Learnable Planners for Long-Term Recommendation. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1893-1903. https://doi.org/10.1145/3626772.3657683
  • (2024) Towards Reliable and Efficient Long-Term Recommendation with Large Foundation Models. Companion Proceedings of the ACM on Web Conference 2024, 1190-1193. https://doi.org/10.1145/3589335.3651258
  • (2024) Towards Knowledge-Aware and Deep Reinforced Cross-Domain Recommendation Over Collaborative Knowledge Graph. IEEE Transactions on Knowledge and Data Engineering, Vol. 36, 11, 7171-7187. https://doi.org/10.1109/TKDE.2024.3391268
  • (2024) RLISR: A Deep Reinforcement Learning Based Interactive Service Recommendation Model. IEEE Access, Vol. 12, 90204-90217. https://doi.org/10.1109/ACCESS.2024.3420395
  • (2024) IDC-CDR: Cross-domain Recommendation based on Intent Disentanglement and Contrast Learning. Information Processing & Management, Vol. 61, 6, 103871. https://doi.org/10.1016/j.ipm.2024.103871
  • (2024) Cross-Domain Sequential Recommendation with Temporal Encoding and Projection-Based Learning. Web Information Systems Engineering – WISE 2024, 75-90. https://doi.org/10.1007/978-981-96-0570-5_6
  • (2023) Decoupled Progressive Distillation for Sequential Prediction with Interaction Dynamics. ACM Transactions on Information Systems, Vol. 42, 3, 1-35. https://doi.org/10.1145/3632403
  • (2023) Deep reinforcement learning in recommender systems. Knowledge-Based Systems, Vol. 264, 110335. https://doi.org/10.1016/j.knosys.2023.110335
