skip to main content
10.1145/3331184.3331268acmconferencesArticle/Chapter ViewAbstractPublication PagesirConference Proceedingsconference-collections
research-article

Warm Up Cold-start Advertisements: Improving CTR Predictions via Learning to Learn ID Embeddings

Published: 18 July 2019 Publication History

Abstract

Click-through rate (CTR) prediction has been one of the most central problems in computational advertising. Lately, embedding techniques that produce low-dimensional representations of ad IDs drastically improve CTR prediction accuracies. However, such learning techniques are data demanding and work poorly on new ads with little logging data, which is known as the cold-start problem.
In this paper, we aim to improve CTR predictions during both the cold-start phase and the warm-up phase when a new ad is added to the candidate pool. We propose Meta-Embedding, a meta-learning-based approach that learns to generate desirable initial embeddings for new ad IDs. The proposed method trains an embedding generator for new ad IDs by making use of previously learned ads through gradient-based meta-learning. In other words, our method learns how to learn better embeddings. When a new ad comes, the trained generator initializes the embedding of its ID by feeding its contents and attributes. Next, the generated embedding can speed up the model fitting during the warm-up phase when a few labeled examples are available, compared to the existing initialization methods.
Experimental results on three real-world datasets showed that Meta-Embedding can significantly improve both the cold-start and warm-up performances for six existing CTR prediction models, ranging from lightweight models such as Factorization Machines to complicated deep models such as PNN and DeepFM. All of the above apply to conversion rate (CVR) predictions as well.

Supplementary Material

MP4 File (cite2-12h00-d3.mp4)

References

[1]
Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. Tensorflow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation. 265--283.
[2]
Stéphane Caron and Smriti Bhagat. 2013. Mixing bandits: A recipe for improved cold-start recommendations in a social network. In Proceedings of the 7th Workshop on Social Network Mining and Analysis. ACM, 11.
[3]
Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, et al. 2016. Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. ACM, 7--10.
[4]
Janghoon Choi, Junseok Kwon, and Kyoung Mu Lee. 2017. Deep Meta Learning for Real-Time Visual Tracking based on Target-Specific Feature Space. arXiv preprint arXiv:1712.09153 (2017).
[5]
Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. In International Conference on Machine Learning. 1126--1135.
[6]
Nadav Golbandi, Yehuda Koren, and Ronny Lempel. 2011. Adaptive bootstrapping of recommender systems using decision trees. In Proceedings of the fourth ACM international conference on Web search and data mining. ACM, 595--604.
[7]
Erin Grant, Chelsea Finn, Sergey Levine, Trevor Darrell, and Thomas Griffiths. 2018. Recasting gradient-based meta-learning as hierarchical bayes. arXiv preprint arXiv:1801.08930 (2018).
[8]
Quanquan Gu, Jie Zhou, and Chris Ding. 2010. Collaborative filtering: Weighted nonnegative matrix factorization incorporating user and item graphs. In Proceedings of the 2010 SIAM international conference on data mining. SIAM, 199--210.
[9]
Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He. 2017. DeepFM: a factorization-machine based neural network for CTR prediction. In Proceedings of the 26th International Joint Conference on Artificial Intelligence. AAAI Press, 1725--1731.
[10]
Abhay S. Harpale and Yiming Yang. 2008. Personalized active learning for collaborative filtering. In Proceedings of the 31st international ACM SIGIR conference on Research and development in information retrieval. ACM, 91--98.
[11]
Xiangnan He and Tat-Seng Chua. 2017. Neural factorization machines for sparse predictive analytics. In Proceedings of the 40th International ACM SIGIR conference on Research and Development in Information Retrieval. ACM, 355--364.
[12]
Xiangnan He, Jinhui Tang, Xiaoyu Du, Richang Hong, Tongwei Ren, and Tat-Seng Chua. 2019. Fast Matrix Factorization With Nonuniform Weights on Missing Data. IEEE transactions on neural networks and learning systems (2019).
[13]
Yuchin Juan, Damien Lefortier, and Olivier Chapelle. 2017. Field-aware factorization machines in a real-world online advertising system. In Proceedings of the 26th International Conference on World Wide Web Companion. International World Wide Web Conferences Steering Committee, 680--688.
[14]
Douwe Kiela, Changhan Wang, and Kyunghyun Cho. 2018. Dynamic Meta-Embeddings for Improved Sentence Representations. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 1466--1477.
[15]
László Kozma, Alexander Ilin, and Tapani Raiko. 2009. Binary principal component analysis in the Netflix collaborative filtering task. In IEEE International Workshop on Machine Learning for Signal Processing, 2009. IEEE, 1--6.
[16]
Brenden M. Lake, Ruslan Salakhutdinov, and Joshua B. Tenenbaum. 2015. Human-level concept learning through probabilistic program induction. Science, Vol. 350, 6266 (2015), 1332--1338.
[17]
Lihong Li, Wei Chu, John Langford, and Robert E. Schapire. 2010. A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th international conference on World wide web. ACM, 661--670.
[18]
Jovian Lin, Kazunari Sugiyama, Min-Yen Kan, and Tat-Seng Chua. 2013. Addressing cold-start in app recommendation: latent user models constructed from twitter followers. In Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval. ACM, 283--292.
[19]
Kaixiang Mo, Bo Liu, Lei Xiao, Yong Li, and Jie Jiang. 2015. Image Feature Learning for Cold Start Problem in Display Advertising. In IJCAI. 3728--3734.
[20]
Hai Thanh Nguyen, Jérémie Mary, and Philippe Preux. 2014. Cold-start problems in recommendation systems via contextual-bandit algorithms. arXiv preprint arXiv:1405.7544 (2014).
[21]
Feiyang Pan, Qingpeng Cai, Pingzhong Tang, Fuzhen Zhuang, and Qing He. 2019. Policy Gradients for Contextual Recommendations. In The World Wide Web Conference (WWW '19). ACM, New York, NY, USA, 1421--1431.
[22]
Feiyang Pan, Qingpeng Cai, An-Xiang Zeng, Chun-Xiang Pan, Qing Da, Hualin He, Qing He, and Pingzhong Tang. 2018. Policy Optimization with Model-based Explorations. arXiv preprint arXiv:1811.07350 (2018).
[23]
Seung-Taek Park, David Pennock, Omid Madani, Nathan Good, and Dennis DeCoste. 2006. Naïve filterbots for robust cold-start recommendations. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 699--705.
[24]
Yanru Qu, Han Cai, Kan Ren, Weinan Zhang, Yong Yu, Ying Wen, and Jun Wang. 2016. Product-based neural networks for user response prediction. In IEEE 16th International Conference on Data Mining (ICDM). IEEE, 1149--1154.
[25]
Steffen Rendle. 2010. Factorization machines. In IEEE 10th International Conference on Data Mining (ICDM). IEEE, 995--1000.
[26]
Steffen Rendle, Zeno Gantner, Christoph Freudenthaler, and Lars Schmidt-Thieme. 2011. Fast context-aware recommendations with factorization machines. In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval. ACM, 635--644.
[27]
Sujoy Roy and Sharath Chandra Guntuku. 2016. Latent factor representations for cold-start video recommendation. In Proceedings of the 10th ACM Conference on Recommender Systems. ACM, 99--106.
[28]
Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. 2002. Incremental singular value decomposition algorithms for highly scalable recommender systems. In Fifth International Conference on Computer and Information Science. Citeseer, 27--28.
[29]
Martin Saveski and Amin Mantrach. 2014. Item cold-start recommendations: learning local collective embeddings. In Proceedings of the 8th ACM Conference on Recommender systems. ACM, 89--96.
[30]
Andrew I. Schein, Alexandrin Popescul, Lyle H. Ungar, and David M. Pennock. 2002. Methods and metrics for cold-start recommendations. In Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 253--260.
[31]
Yanir Seroussi, Fabian Bohnert, and Ingrid Zukerman. 2011. Personalised rating prediction for new users using latent factor models. In Proceedings of the 22nd ACM conference on Hypertext and hypermedia. ACM, 47--56.
[32]
Parikshit Shah, Ming Yang, Sachidanand Alle, Adwait Ratnaparkhi, Ben Shahshahani, and Rohit Chandra. 2017. A practical exploration system for search advertising. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1625--1631.
[33]
Gábor Takács, István Pilászy, Bottyán Németh, and Domonkos Tikk. 2008. Investigation of various matrix factorization methods for large recommender systems. In IEEE International Conference on Data Mining Workshops, 2008. IEEE, 553--562.
[34]
Liang Tang, Yexi Jiang, Lei Li, Chunqiu Zeng, and Tao Li. 2015. Personalized recommendation via parameter-free contextual bandits. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 323--332.
[35]
Joaquin Vanschoren. 2018. Meta-Learning: A Survey. arXiv preprint arXiv:1810.03548 (2018).
[36]
Manasi Vartak, Arvind Thiagarajan, Conrado Miranda, Jeshua Bratman, and Hugo Larochelle. 2017. A Meta-Learning Perspective on Cold-Start Recommendations for Items. In Advances in Neural Information Processing Systems. 6904--6914.
[37]
Ricardo Vilalta and Youssef Drissi. 2002. A perspective view and survey of meta-learning. Artificial Intelligence Review, Vol. 18, 2 (2002), 77--95.
[38]
Maksims Volkovs, Guangwei Yu, and Tomi Poutanen. 2017. DropoutNet: Addressing Cold Start in Recommender Systems. In Advances in Neural Information Processing Systems. 4957--4966.
[39]
Maksims Volkovs and Guang Wei Yu. 2015. Effective latent models for binary feedback in recommender systems. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 313--322.
[40]
Hu Xu, Bing Liu, Lei Shu, and Philip S. Yu. 2018. Lifelong Domain Word Embedding via Meta-Learning. arXiv preprint arXiv:1805.09991 (2018).
[41]
Jingwei Xu, Yuan Yao, Hanghang Tong, Xianping Tao, and Jian Lu. 2017. R a P are: A Generic Strategy for Cold-Start Rating Prediction Problem. IEEE Transactions on Knowledge and Data Engineering, Vol. 29, 6 (2017), 1296--1309.
[42]
Yuan Yao, Hanghang Tong, Guo Yan, Feng Xu, Xiang Zhang, Boleslaw K. Szymanski, and Jian Lu. 2014. Dual-regularized one-class collaborative filtering. In Proceedings of the 23rd ACM International Conference on Information and Knowledge Management. ACM, 759--768.
[43]
Wenpeng Yin and Hinrich Schütze. 2016. Learning Word Meta-Embeddings. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Vol. 1. 1351--1360.
[44]
Mi Zhang, Jie Tang, Xuchen Zhang, and Xiangyang Xue. 2014. Addressing cold start in recommender systems: A semi-supervised co-training algorithm. In Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval. ACM, 73--82.
[45]
Wayne Xin Zhao, Sui Li, Yulan He, Edward Y. Chang, Ji-Rong Wen, and Xiaoming Li. 2016. Connecting social media to e-commerce: Cold-start product recommendation using microblogging information. IEEE Transactions on Knowledge and Data Engineering, Vol. 28, 5 (2016), 1147--1159.
[46]
Guorui Zhou, Xiaoqiang Zhu, Chenru Song, Ying Fan, Han Zhu, Xiao Ma, Yanghui Yan, Junqi Jin, Han Li, and Kun Gai. 2018. Deep interest network for click-through rate prediction. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 1059--1068.
[47]
Ke Zhou, Shuang-Hong Yang, and Hongyuan Zha. 2011. Functional matrix factorizations for cold-start recommendation. In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval. ACM, 315--324.

Cited By

View all
  • (2025)C2lRec: Causal Contrastive Learning for User Cold-start Recommendation with Social VariableACM Transactions on Information Systems10.1145/3711858Online publication date: 9-Jan-2025
  • (2025)Large Language Model Simulator for Cold-Start RecommendationProceedings of the Eighteenth ACM International Conference on Web Search and Data Mining10.1145/3701551.3703546(261-270)Online publication date: 10-Mar-2025
  • (2025)Meta Recommendation With Robustness ImprovementIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.350941637:2(781-793)Online publication date: Feb-2025
  • Show More Cited By

Index Terms

  1. Warm Up Cold-start Advertisements: Improving CTR Predictions via Learning to Learn ID Embeddings

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2019
    1512 pages
    ISBN:9781450361729
    DOI:10.1145/3331184
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 18 July 2019

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. cold-start
    2. ctr prediction
    3. meta-learning
    4. online advertising

    Qualifiers

    • Research-article

    Funding Sources

    • Alibaba Innovative Research
    • CCF-Tencent Rhino-Bird Young Faculty Open Research Fund
    • National Key Research and Development Program of China
    • Ant Financial
    • Youth Innovation Promotion Association CAS
    • National Natural Science Foundation of China
    • China Youth 1000-talent program

    Conference

    SIGIR '19
    Sponsor:

    Acceptance Rates

    SIGIR'19 Paper Acceptance Rate 84 of 426 submissions, 20%;
    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)173
    • Downloads (Last 6 weeks)13
    Reflects downloads up to 28 Feb 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2025)C2lRec: Causal Contrastive Learning for User Cold-start Recommendation with Social VariableACM Transactions on Information Systems10.1145/3711858Online publication date: 9-Jan-2025
    • (2025)Large Language Model Simulator for Cold-Start RecommendationProceedings of the Eighteenth ACM International Conference on Web Search and Data Mining10.1145/3701551.3703546(261-270)Online publication date: 10-Mar-2025
    • (2025)Meta Recommendation With Robustness ImprovementIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.350941637:2(781-793)Online publication date: Feb-2025
    • (2025)Comprehensive Review of Meta-Learning Methods for Cold-Start Issue in Recommendation SystemsIEEE Access10.1109/ACCESS.2025.353602513(24622-24641)Online publication date: 2025
    • (2025)Explainable robo-advisor: An online learning framework for new investors without trading recordsNeurocomputing10.1016/j.neucom.2025.129463(129463)Online publication date: Jan-2025
    • (2025)Meta doubly robust: Debiasing CVR prediction via meta-learning with a small amount of unbiased dataKnowledge-Based Systems10.1016/j.knosys.2024.112898310(112898)Online publication date: Feb-2025
    • (2024)RGMeta: Enhancing Cold-Start Recommendations with a Residual Graph Meta-Embedding ModelElectronics10.3390/electronics1317347313:17(3473)Online publication date: 1-Sep-2024
    • (2024)Recommendation as Instruction Following: A Large Language Model Empowered Recommendation ApproachACM Transactions on Information Systems10.1145/3708882Online publication date: 20-Dec-2024
    • (2024)Prompt Tuning for Item Cold-start RecommendationProceedings of the 18th ACM Conference on Recommender Systems10.1145/3640457.3688126(411-421)Online publication date: 8-Oct-2024
    • (2024)MARec: Metadata Alignment for cold-start RecommendationProceedings of the 18th ACM Conference on Recommender Systems10.1145/3640457.3688125(401-410)Online publication date: 8-Oct-2024
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Figures

    Tables

    Media

    Share

    Share

    Share this Publication link

    Share on social media