skip to main content
10.1145/3394171.3413739acmconferencesArticle/Chapter ViewAbstractPublication PagesmmConference Proceedingsconference-collections
research-article

Task-distribution-aware Meta-learning for Cold-start CTR Prediction

Authors Info & Claims
Published:12 October 2020Publication History

ABSTRACT

Nowadays, click-through rate (CTR) prediction has achieved great success in online advertising. However, making desirable predictions for unseen ads is still challenging, which is known as the cold-start problem. To address such a problem in CTR prediction, meta-learning methods have recently emerged as a popular direction. In these approaches, the predictions for each user/item are regarded as individual tasks, then training a meta-learner on them to implement zero-shot/few-shot learning for unknown tasks. Though these approaches have effectively alleviated the cold-start problem, two facts are not paid enough attention, 1) the diversity of the task difficulty and 2) the perturbation of the task distribution. In this paper, we propose an adaptive loss that ensures the consistency between the task weight and difficulty. Interestingly, the loss function can also be viewed as a description of the worst-case performance under distribution perturbation. Moreover, we develop an algorithm, under the framework of gradient descent with max-oracle (GDmax), to minimize such an adaptive loss. Then we prove the algorithm can return to a stationary point of the adaptive loss. Finally, we implement our method on top of the meta-embedding framework and conduct experiments on three real-world datasets. The experiments show that our proposed method significantly improves the predictions in the cold-start scenario.

Skip Supplemental Material Section

Supplemental Material

3394171.3413739.mp4

mp4

56.1 MB

References

  1. Aharon Ben-Tal and Arkadi Nemirovski. 2002. Robust optimization--methodology and applications. Mathematical Programming, Vol. 92, 3 (2002), 453--480.Google ScholarGoogle ScholarCross RefCross Ref
  2. Stephen Boyd and Lieven Vandenberghe. 2004. Convex optimization. Cambridge university press.Google ScholarGoogle Scholar
  3. Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, et almbox. 2016. Wide & deep learning for recommender systems. In workshop on DLRS. 7--10.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. John M. Danskin. 1966. The Theory of Max-Min, with Applications. SIAM J. Appl. Math., Vol. 14, 4 (1966), 641--664.Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. John Duchi, Shai Shalev-Shwartz, Yoram Singer, and Tushar Chandra. 2008. Efficient projections onto the l1-ball for learning in high dimensions. In ICML. 272--279.Google ScholarGoogle Scholar
  6. Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In ICML. 1126--1135.Google ScholarGoogle Scholar
  7. Wenjing Fu, Zhaohui Peng, Senzhang Wang, Yang Xu, and Jin Li. 2019. Deeply Fusing Reviews and Contents for Cold Start Users in Cross-Domain Recommendation Systems. In AAAI. 94--101.Google ScholarGoogle Scholar
  8. Quanquan Gu, Jie Zhou, and Chris Ding. 2010. Collaborative filtering: Weighted nonnegative matrix factorization incorporating user and item graphs. In SDM. 199--210.Google ScholarGoogle Scholar
  9. Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He. 2017. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction. In IJCAI. 1725--1731.Google ScholarGoogle Scholar
  10. Xiangnan He and Tat-Seng Chua. 2017. Neural factorization machines for sparse predictive analytics. In SIGIR. 355--364.Google ScholarGoogle Scholar
  11. Fuxing Hong, Dongbo Huang, and Ge Chen. 2019. Interaction-Aware Factorization Machines for Recommender Systems. In AAAI. 3804--3811.Google ScholarGoogle Scholar
  12. Liang Hu, Songlei Jian, Longbing Cao, Zhiping Gu, Qingkui Chen, and Artak Amirbekyan. 2019. HERS: Modeling Influential Contexts with Heterogeneous Relations for Sparse and Cold-Start Recommendation. In AAAI. 3830--3837.Google ScholarGoogle Scholar
  13. Yuchin Juan, Damien Lefortier, and Olivier Chapelle. 2017. Field-aware Factorization Machines in a Real-world Online Advertising System. In WWW. 680--688.Google ScholarGoogle Scholar
  14. Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In ICLR.Google ScholarGoogle Scholar
  15. Liang Lan and Yu Geng. 2019. Accurate and Interpretable Factorization Machines. In AAAI. 4139--4146.Google ScholarGoogle Scholar
  16. Jianxun Lian, Xiaohuan Zhou, Fuzheng Zhang, Zhongxia Chen, Xing Xie, and Guangzhong Sun. 2018. xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems. In SIGKDD. 1754--1763.Google ScholarGoogle Scholar
  17. Ming Lin, Shuang Qiu, Jieping Ye, Xiaomin Song, Qi Qian, Liang Sun, Shenghuo Zhu, and Rong Jin. 2019 b. Which Factorization Machine Modeling Is Better: A Theoretical Answer with Optimal Guarantee. In AAAI. 4312--4319.Google ScholarGoogle Scholar
  18. Tianyi Lin, Chi Jin, and Michael I Jordan. 2019 a. On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems. arXiv preprint arXiv:1906.00331 (2019).Google ScholarGoogle Scholar
  19. Greg Linden, Brent Smith, and Jeremy York. 2003. Industry Report: Amazon.com Recommendations: Item-to-Item Collaborative Filtering. IEEE Distributed Systems Online, Vol. 4, 1 (2003).Google ScholarGoogle Scholar
  20. Jiawei Liu, Zheng-Jun Zha, Di Chen, Richang Hong, and Meng Wang. 2019. Adaptive transfer network for cross-domain person re-identification. In CVPR. 7202--7211.Google ScholarGoogle Scholar
  21. Weiwen Liu, Ruiming Tang, Jiajin Li, Jinkai Yu, Huifeng Guo, Xiuqiang He, and Shengyu Zhang. 2018. Field-aware probabilistic embedding neural network for CTR prediction. In RecSys. 412--416.Google ScholarGoogle Scholar
  22. Hongseok Namkoong and John C Duchi. 2017. Variance-based regularization with convex objectives. In NIPS. 2971--2980.Google ScholarGoogle Scholar
  23. Wentao Ouyang, Xiuwu Zhang, Li Li, Heng Zou, Xin Xing, Zhaojie Liu, and Yanlong Du. 2019. Deep Spatio-Temporal Neural Networks for Click-Through Rate Prediction. In SIGKDD. 2078--2086.Google ScholarGoogle Scholar
  24. Feiyang Pan, Shuokai Li, Xiang Ao, Pingzhong Tang, and Qing He. 2019. Warm Up Cold-start Advertisements: Improving CTR Predictions via Learning to Learn ID Embeddings. In SIGIR.Google ScholarGoogle Scholar
  25. Junwei Pan, Jian Xu, Alfonso Lobos Ruiz, Wenliang Zhao, Shengjun Pan, Yu Sun, and Quan Lu. 2018. Field-weighted factorization machines for click-through rate prediction in display advertising. In WWW. 1349--1357.Google ScholarGoogle Scholar
  26. Qi Pi, Weijie Bian, Guorui Zhou, Xiaoqiang Zhu, and Kun Gai. 2019. Practice on Long Sequential User Behavior Modeling for Click-Through Rate Prediction. In SIGKDD. ACM, 2671--2679.Google ScholarGoogle Scholar
  27. Surabhi Punjabi and Priyanka Bhatt. 2018. Robust Factorization Machines for User Response Prediction. In WWW. 669--678.Google ScholarGoogle Scholar
  28. Yanru Qu, Han Cai, Kan Ren, Weinan Zhang, Yong Yu, Ying Wen, and Jun Wang. 2016. Product-based neural networks for user response prediction. In ICDM. 1149--1154.Google ScholarGoogle Scholar
  29. Kan Ren, Jiarui Qin, Yuchen Fang, Weinan Zhang, Lei Zheng, Weijie Bian, Guorui Zhou, Jian Xu, Yong Yu, Xiaoqiang Zhu, and Kun Gai. 2019. Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction. In SIGIR. 565--574.Google ScholarGoogle Scholar
  30. Steffen Rendle, Zeno Gantner, Christoph Freudenthaler, and Lars Schmidt-Thieme. 2011. Fast context-aware recommendations with factorization machines. In SIGIR. 635--644.Google ScholarGoogle Scholar
  31. Sujoy Roy and Sharath Chandra Guntuku. 2016. Latent factor representations for cold-start video recommendation. In RecSys. 99--106.Google ScholarGoogle Scholar
  32. Martin Saveski and Amin Mantrach. 2014. Item cold-start recommendations: learning local collective embeddings. In RecSys. 89--96.Google ScholarGoogle Scholar
  33. Andrew I Schein, Alexandrin Popescul, Lyle H Ungar, and David M Pennock. 2002. Methods and metrics for cold-start recommendations. In SIGIR. 253--260.Google ScholarGoogle Scholar
  34. Yanir Seroussi, Fabian Bohnert, and Ingrid Zukerman. 2011. Personalised rating prediction for new users using latent factor models. In HT. 47--56.Google ScholarGoogle Scholar
  35. Weiping Song, Chence Shi, Zhiping Xiao, Zhijian Duan, Yewen Xu, Ming Zhang, and Jian Tang. 2019. AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks. In CIKM. ACM, 1161--1170.Google ScholarGoogle Scholar
  36. Nitish Srivastava, Geoffrey E. Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent neural networks from overfitting. JMLR, Vol. 15, 1 (2014), 1929--1958.Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Alexandre B. Tsybakov. 2009. Introduction to Nonparametric Estimation.Google ScholarGoogle Scholar
  38. Manasi Vartak, Arvind Thiagarajan, Conrado Miranda, Jeshua Bratman, and Hugo Larochelle. 2017. A meta-learning perspective on cold-start recommendations for items. In NIPS. 6904--6914.Google ScholarGoogle Scholar
  39. Ricardo Vilalta and Youssef Drissi. 2002. A perspective view and survey of meta-learning. Artificial intelligence review, Vol. 18, 2 (2002), 77--95.Google ScholarGoogle Scholar
  40. Maksims Volkovs, Guangwei Yu, and Tomi Poutanen. 2017. Dropoutnet: Addressing cold start in recommender systems. In NIPS. 4957--4966.Google ScholarGoogle Scholar
  41. Qianqian Wang, Fang'ai Liu, Shuning Xing, Xiaohui Zhao, and Tianlai Li. 2019. Research on CTR Prediction Based on Deep Learning. IEEE Access (2019), 12779--12789.Google ScholarGoogle Scholar
  42. Ruoxi Wang, Bin Fu, Gang Fu, and Mingliang Wang. 2017. Deep & Cross Network for Ad Click Predictions. In ADKDD, 2017. 12:1--12:7.Google ScholarGoogle Scholar
  43. Jun Xiao, Hao Ye, Xiangnan He, Hanwang Zhang, Fei Wu, and Tat-Seng Chua. 2017. Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks. In IJCAI. 3119--3125.Google ScholarGoogle Scholar
  44. Mi Zhang, Jie Tang, Xuchen Zhang, and Xiangyang Xue. 2014. Addressing cold start in recommender systems: A semi-supervised co-training algorithm. In SIGIR. 73--82.Google ScholarGoogle Scholar
  45. Weinan Zhang, Tianming Du, and Jun Wang. 2016. Deep Learning over Multi-field Categorical Data - - A Case Study on User Response Prediction. In ECIR. 45--57.Google ScholarGoogle Scholar
  46. Wayne Xin Zhao, Sui Li, Yulan He, Edward Y Chang, Ji-Rong Wen, and Xiaoming Li. 2016. Connecting social media to e-commerce: cold-start product recommendation using microblogging information. IEEE Transactions on Knowledge and Data Engineering, Vol. 28 (2016), 1147--1159.Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Guorui Zhou, Na Mou, Ying Fan, Qi Pi, Weijie Bian, Chang Zhou, Xiaoqiang Zhu, and Kun Gai. 2019. Deep interest evolution network for click-through rate prediction. In AAAI. 5941--5948.Google ScholarGoogle Scholar
  48. Guorui Zhou, Xiaoqiang Zhu, Chenru Song, Ying Fan, Han Zhu, Xiao Ma, Yanghui Yan, Junqi Jin, Han Li, and Kun Gai. 2018. Deep interest network for click-through rate prediction. In SIGKDD. 1059--1068.Google ScholarGoogle Scholar

Index Terms

  1. Task-distribution-aware Meta-learning for Cold-start CTR Prediction

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      MM '20: Proceedings of the 28th ACM International Conference on Multimedia
      October 2020
      4889 pages
      ISBN:9781450379885
      DOI:10.1145/3394171

      Copyright © 2020 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 12 October 2020

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate995of4,171submissions,24%

      Upcoming Conference

      MM '24
      MM '24: The 32nd ACM International Conference on Multimedia
      October 28 - November 1, 2024
      Melbourne , VIC , Australia

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader