Loss Harmonizing for Multi-Scenario CTR Prediction

Authors:
Congcong Liu, JD.com, Hong Kong and ECE, HKUST, Hong Kong (ORCID: 0000-0002-1749-1075)
Liang Shi, JD.com, China (ORCID: 0009-0008-0077-8302)
Pei Wang, JD.com, China (ORCID: 0000-0001-9910-4114)
Fei Teng, JD.com, China (ORCID: 0000-0002-4507-6864)
Xue Jiang, JD.com, China (ORCID: 0009-0005-7164-0384)
Changping Peng, JD.com, China (ORCID: 0009-0002-2561-1919)
Zhangang Lin, JD.com, China (ORCID: 0000-0003-1379-5044)
Jingping Shao, JD.com, China (ORCID: 0000-0001-8555-2020)

RecSys '23: Proceedings of the 17th ACM Conference on Recommender Systems, September 2023, Pages 195–199
https://doi.org/10.1145/3604915.3608865

Published: 14 September 2023

ABSTRACT

Large-scale industrial systems often include multiple scenarios to satisfy diverse user needs. The common approach of training one model per scenario does not scale well and is not suitable for minor scenarios with limited samples. An alternative is to train a single model on all scenarios, but this can introduce domination and bias from the main scenario. MMoE-like structures have been proposed for multi-scenario prediction, but they do not explicitly address the issue of gradient imbalance. This work proposes an adaptive loss harmonizing (ALH) algorithm for multi-scenario CTR prediction. It dynamically adjusts the learning speed for balanced training and improved performance. Experiments on real industrial datasets and rigorous A/B testing demonstrate the method's superiority.
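
The abstract does not spell out how ALH computes its adjustments, so the following is only an illustrative sketch of the general idea of reweighting per-scenario losses so that no scenario dominates training. It is not the authors' algorithm: it substitutes a Dynamic-Weight-Average-style heuristic (Liu, Johns, and Davison, 2019), in which a scenario whose loss is falling slowly receives a larger weight in the next step. The function names, the scenario names, and the `temperature` parameter are all assumptions made for illustration.

```python
import math
from typing import Dict


def scenario_weights(
    prev_losses: Dict[str, float],
    prev_prev_losses: Dict[str, float],
    temperature: float = 2.0,
) -> Dict[str, float]:
    """Hypothetical helper: DWA-style weights, NOT the paper's ALH rule.

    Each scenario's weight grows with the ratio L(t-1) / L(t-2), so a
    scenario that is learning slowly (ratio close to or above 1) is
    emphasized. Losses are assumed to be strictly positive.
    """
    # Relative descent rate per scenario; smaller means faster learning.
    ratios = {s: prev_losses[s] / prev_prev_losses[s] for s in prev_losses}
    # Softmax over the ratios, rescaled so the weights sum to the
    # number of scenarios (equal progress gives every weight 1.0).
    exps = {s: math.exp(r / temperature) for s, r in ratios.items()}
    total = sum(exps.values())
    num_scenarios = len(exps)
    return {s: num_scenarios * e / total for s, e in exps.items()}


def harmonized_loss(losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the current per-scenario losses."""
    return sum(weights[s] * losses[s] for s in losses)


if __name__ == "__main__":
    # Illustrative numbers only: the main scenario's loss drops quickly,
    # the minor scenario's loss barely moves.
    w = scenario_weights(
        prev_losses={"main": 0.40, "minor": 0.68},
        prev_prev_losses={"main": 0.50, "minor": 0.70},
    )
    print(w)  # the minor scenario receives the larger weight
    print(harmonized_loss({"main": 0.39, "minor": 0.67}, w))
```

In a real training loop the weights would typically be recomputed once per epoch or per fixed number of steps from averaged losses, and applied to the per-scenario log losses before backpropagation; those scheduling choices are likewise assumptions, not details from the paper.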

Published in
RecSys '23: Proceedings of the 17th ACM Conference on Recommender Systems
September 2023
1406 pages
ISBN: 9798400702419
DOI: 10.1145/3604915
Editors: Jie Zhang, Li Chen, Shlomo Berkovsky, Min Zhang, Tommaso di Noia, Justin Basilico, Luiz Pizzato, Yang Song
Copyright © 2023 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- abstract
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 254 of 1,295 submissions, 20%