research-article

One Model to Serve All: Star Topology Adaptive Recommender for Multi-Domain CTR Prediction

Authors:

Xiang-Rong Sheng,

Xiaoqiang Zhu

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management

Pages 4104 - 4113

https://doi.org/10.1145/3459637.3481941

Published: 30 October 2021

Abstract

Traditional industrial recommendation systems usually train a model on data from a single domain and then serve that domain. However, a large-scale commercial platform often contains multiple domains, and its recommendation system needs to make click-through rate (CTR) predictions for all of them. Different domains may share some user groups and items, while each domain may also have user groups and items of its own. Moreover, even the same user may behave differently in different domains. To leverage the data from all domains, a single model can be trained to serve all of them, but it is difficult for a single model to capture the characteristics of each domain and serve every domain well. On the other hand, training a separate model for each domain does not make full use of the data from all domains. In this paper, we propose the Star Topology Adaptive Recommender (STAR), a single model that serves all domains by leveraging data from all domains simultaneously, capturing the characteristics of each domain, and modeling the commonalities between domains. Essentially, the network of each domain consists of two factorized networks: a centered network shared by all domains and a domain-specific network tailored to that domain. For each domain, we combine the two factorized networks into a unified network by element-wise multiplying the weights of the shared network with those of the domain-specific network. Other combination functions are possible and are left open for further research. Most importantly, STAR can learn the shared network from all the data and adapt the domain-specific parameters to the characteristics of each domain. Experimental results on production data validate the superiority of the proposed STAR model.
Since late 2020, STAR has been deployed in the display advertising system of Alibaba, yielding an 8.0% improvement in CTR and a 6.0% increase in RPM (Revenue Per Mille).
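
The abstract describes building each domain's network by element-wise multiplying the weights of the shared network with those of the domain-specific network. A minimal sketch of that combination for a single fully connected layer, in plain Python, might look like the following. The helper names (`hadamard`, `matvec`, `star_layer`), the domain labels, the toy weight values, and the omission of bias terms and activations are all illustrative assumptions, not details taken from the paper.

```python
def hadamard(a, b):
    """Element-wise product of two equally shaped matrices (lists of rows)."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def matvec(x, w):
    """Multiply a row vector x (length d_in) by a d_in x d_out matrix w."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def star_layer(x, shared_w, domain_ws, domain_id):
    """Apply one layer whose effective weights are shared_w * domain_ws[domain_id]."""
    combined_w = hadamard(shared_w, domain_ws[domain_id])
    return matvec(x, combined_w)


# Toy shared weights (3 inputs, 2 outputs); values are made up for illustration.
shared_w = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Hypothetical domain-specific weights. All-ones weights leave the shared
# network unchanged, so "domain_a" behaves exactly like the shared layer.
domain_ws = {
    "domain_a": [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]],
    "domain_b": [[0.5, 0.0], [0.5, 2.0], [0.5, 1.0]],
}

x = [1.0, 1.0, 1.0]
out_a = star_layer(x, shared_w, domain_ws, "domain_a")
out_b = star_layer(x, shared_w, domain_ws, "domain_b")
print(out_a)  # [9.0, 12.0]
print(out_b)  # [4.5, 14.0]
```

In this sketch the same input produces different outputs per domain while the shared weights are stored only once, which is the property the abstract attributes to the factorized design.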

Index Terms

One Model to Serve All: Star Topology Adaptive Recommender for Multi-Domain CTR Prediction
1. Information systems
  1. Information retrieval

Published In

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management

October 2021

4966 pages

ISBN:9781450384469

DOI:10.1145/3459637

General Chairs: Gianluca Demartini (The University of Queensland, Australia), Guido Zuccon (The University of Queensland, Australia)

Program Chairs: J. Shane Culpepper (RMIT University, Australia), Zi Huang (The University of Queensland, Australia), Hanghang Tong (University of Illinois at Urbana-Champaign, USA)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CIKM '21

CIKM '21: The 30th ACM International Conference on Information and Knowledge Management

November 1 - 5, 2021

Queensland, Virtual Event, Australia

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
