DOI: 10.1145/3616855.3635756

Follow the LIBRA: Guiding Fair Policy for Unified Impression Allocation via Adversarial Rewarding

Published: 04 March 2024

Abstract

Diverse advertiser demands (brand effects or immediate outcomes) lead to distinct selling patterns (pre-agreed volumes with an under-delivery penalty, or per-auction competition) and pricing patterns (fixed prices or varying bids) in guaranteed delivery (GD) and real-time bidding (RTB) advertising. Unifying the two markets therefore requires fair impression allocation that promotes both ad content diversity and overall revenue. Existing approaches often deprive RTB ads of equal exposure opportunities by prioritizing GD ads, and coarse-grained methods fall short on three challenges: 1) ambiguous reward, because the varied objectives and constraints of GD fulfillment and RTB utility hinder measuring each allocation's contribution to the global interests; 2) intensified competition from the coexistence of GD and RTB ads, which complicates their mutual relationships; 3) policy degradation caused by evolving user traffic and bid landscape, which requires adaptivity to distribution shifts.
We propose LIBRA, a generative-adversarial framework that unifies GD and RTB ads through request-level modeling. To guide the generative allocator, we solve convex optimization on historical data to derive hindsight optimal allocations that balance fairness and utility. We then train a discriminator to distinguish the generated actions from the demonstrations of this solved latent expert policy, providing an integrated reward that aligns LIBRA with the optimal fair policy. LIBRA employs a self-attention encoder to capture the competitive relations among the varying number of candidate ads per allocation. Further, it enhances the discriminator with an information-bottleneck-based summarizer that guards against overfitting to irrelevant distractors in the ad environment. LIBRA adopts a decoupled structure, in which the offline discriminator is continuously fine-tuned on newly arriving allocations and periodically guides updates of the online allocation policy to accommodate online dynamics. LIBRA has been deployed on the Tencent advertising system for over four months, with extensive experiments conducted. Online A/B tests demonstrate significant lifts in ad income (3.17%), overall click-through rate (1.56%), and cost-per-mille (3.20%), contributing a daily revenue increase of hundreds of thousands of RMB.
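The adversarial rewarding idea described above can be illustrated with a toy sketch. It is not the authors' implementation: the feature layout, the linear logistic discriminator, the always-GD "generated" policy, and the GAIL-style reward `-log(1 - D(x))` are all assumptions chosen for illustration, and names such as `features` and `adversarial_reward` are hypothetical. The "expert" demonstrations stand in for hindsight optimal allocations, which the paper obtains by convex optimization rather than by the simple rule used here.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def features(signal, action):
    """Hypothetical features: signal = GD deficit minus 0.5, action = +1 (GD) or -1 (RTB)."""
    return [signal, float(action), signal * action]


def logit(w, b, x):
    """Linear discriminator score w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_discriminator(expert, generated, epochs=300, lr=0.5):
    """Fit D(x) = sigmoid(w.x + b) by gradient descent; label 1 = expert, 0 = generated."""
    dim = len(expert[0])
    w, b = [0.0] * dim, 0.0
    data = [(x, 1.0) for x in expert] + [(x, 0.0) for x in generated]
    n = len(data)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in data:
            err = sigmoid(logit(w, b, x)) - y  # gradient of the log loss w.r.t. the logit
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def adversarial_reward(w, b, x, eps=1e-8):
    """GAIL-style reward (an assumption): larger when D judges x to be expert-like."""
    return -math.log(1.0 - sigmoid(logit(w, b, x)) + eps)


rng = random.Random(0)
signals = [rng.uniform(-0.5, 0.5) for _ in range(200)]
# Toy expert: serve GD only when the GD deficit is above the threshold.
expert = [features(s, 1 if s > 0 else -1) for s in signals]
# Toy generated policy: always prioritize GD, regardless of the deficit.
generated = [features(s, 1) for s in signals]

w, b = train_discriminator(expert, generated)
reward_fair = adversarial_reward(w, b, features(-0.4, -1))     # RTB when GD deficit is low
reward_gd_first = adversarial_reward(w, b, features(-0.4, 1))  # GD forced anyway
print(reward_fair > reward_gd_first)
```

In this sketch the discriminator output acts as a single scalar reward, so the allocator does not need hand-tuned weights between GD fulfillment and RTB utility; whether LIBRA uses this exact reward form is not stated in the abstract.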


      Published In

      WSDM '24: Proceedings of the 17th ACM International Conference on Web Search and Data Mining
      March 2024
      1246 pages
      ISBN:9798400703713
      DOI:10.1145/3616855

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. display advertising
      2. generative adversarial network
      3. imitation learning
      4. reinforcement learning

      Qualifiers

      • Research-article

      Funding Sources

      • Key Research Program of Frontier Sciences, CAS
      • China National Natural Science Foundation
      • "Pioneer" and "Leading Goose" R&D Program of Zhejiang

      Conference

      WSDM '24

      Acceptance Rates

      Overall Acceptance Rate 498 of 2,863 submissions, 17%
