ABSTRACT
Click-through rate (CTR) prediction has achieved great success in online advertising. However, making accurate predictions for unseen ads remains challenging; this is known as the cold-start problem. Meta-learning methods have recently emerged as a popular way to address this problem in CTR prediction. In these approaches, the predictions for each user/item are regarded as individual tasks, and a meta-learner is trained on them to perform zero-shot/few-shot learning on unknown tasks. Although these approaches have effectively alleviated the cold-start problem, two facts have received insufficient attention: 1) the diversity of task difficulty and 2) the perturbation of the task distribution. In this paper, we propose an adaptive loss that ensures consistency between each task's weight and its difficulty. Interestingly, the loss function can also be viewed as a description of the worst-case performance under distribution perturbation. Moreover, we develop an algorithm, under the framework of gradient descent with max-oracle (GDmax), to minimize this adaptive loss. We then prove that the algorithm returns a stationary point of the adaptive loss. Finally, we implement our method on top of the meta-embedding framework and conduct experiments on three real-world datasets. The experiments show that the proposed method significantly improves predictions in the cold-start scenario.
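To make the GDmax idea concrete, the following is a minimal sketch on a toy problem, not the paper's implementation. The abstract does not specify the adaptive loss or the set of allowed task-distribution perturbations, so everything here is an assumption chosen for illustration: the per-task losses are simple quadratics, the task weights live on the probability simplex, and a penalty `lam` keeps them close to uniform. Under these assumptions the inner maximization has an exact solution (a simplex projection), and the outer step is plain gradient descent using the maximizing weights.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    n = v.size
    u = np.sort(v)[::-1]                      # entries in descending order
    css = np.cumsum(u) - 1.0
    ks = np.arange(1, n + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]  # last index kept active
    tau = css[rho] / (rho + 1)                # shift that makes weights sum to 1
    return np.maximum(v - tau, 0.0)

def max_oracle(losses, lam):
    """Assumed inner problem: max over w in the simplex of
    <w, losses> - (lam / 2) * ||w - uniform||^2.
    Its exact solution is the projection of uniform + losses / lam."""
    uniform = np.full(losses.size, 1.0 / losses.size)
    return project_to_simplex(uniform + losses / lam)

def task_losses(theta, centers):
    """Hypothetical per-task loss: 0.5 * ||theta - c_i||^2 for task i."""
    return 0.5 * np.sum((theta - centers) ** 2, axis=1)

def gdmax(centers, lam=1.0, lr=0.1, steps=500):
    """Gradient descent on theta, calling the max-oracle at every step."""
    theta = np.zeros(centers.shape[1])
    for _ in range(steps):
        w = max_oracle(task_losses(theta, centers), lam)
        # Gradient of the weighted loss at the maximizing weights.
        grad = (w[:, None] * (theta - centers)).sum(axis=0)
        theta -= lr * grad
    return theta, max_oracle(task_losses(theta, centers), lam)

# Three similar "easy" tasks and one outlying "hard" task.
centers = np.array([[0.0], [0.0], [0.0], [4.0]])
theta, weights = gdmax(centers)
print(theta, weights)
```

In this toy setting, uniform task weighting would place `theta` at the mean of the task centers; the adaptive weighting instead assigns the outlying task a larger weight and pulls the solution toward it, which is the "weight consistent with difficulty" behavior the abstract describes.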
Supplemental Material
The supplemental zip archive contains a file named sup.pdf, which presents theoretical results and some experimental results.
Index Terms
- Task-distribution-aware Meta-learning for Cold-start CTR Prediction