DOI: 10.1145/3640457.3688058 · ACM RecSys Conference Proceedings · Extended Abstract

Off-Policy Selection for Optimizing Ad Display Timing in Mobile Games (Samsung Instant Plays)

Published: 08 October 2024

Abstract

Off-Policy Selection (OPS) aims to select the best policy from a set of policies trained with offline Reinforcement Learning. In this work, we describe our custom OPS method and its successful application in Samsung Instant Plays for optimizing ad delivery timing. We propose a custom OPS method because traditional Off-Policy Evaluation (OPE) methods often exhibit enormous variance, leading to unreliable results. We applied our OPS method to initialize policies for our custom pseudo-online training pipeline. The final policy yielded a substantial 49% lift in the number of watched ads while maintaining a similar retention rate.
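For readers unfamiliar with the OPS/OPE distinction, the sketch below illustrates the baseline the abstract argues against: scoring each candidate policy with an ordinary importance-sampling OPE estimate and picking the argmax. This is a minimal illustrative assumption, not the paper's custom method; all function and variable names here are hypothetical. The product of per-step probability ratios is exactly the term whose variance explodes on long trajectories, which motivates a custom OPS approach.

```python
import numpy as np

def is_estimate(trajectories, target_probs, behavior_probs, gamma=0.99):
    """Ordinary importance-sampling OPE estimate of a target policy's value.

    trajectories:   list of per-trajectory reward sequences
    target_probs:   matching per-step action probabilities under the target policy
    behavior_probs: matching per-step action probabilities under the behavior policy
    """
    returns = []
    for rewards, pi, b in zip(trajectories, target_probs, behavior_probs):
        # Trajectory importance weight: a product of per-step ratios.
        # This product is the source of the estimator's high variance.
        ratio = float(np.prod(np.asarray(pi) / np.asarray(b)))
        g = sum(gamma**t * r for t, r in enumerate(rewards))  # discounted return
        returns.append(ratio * g)
    return float(np.mean(returns))

def ops_select(candidates, trajectories, behavior_probs, gamma=0.99):
    """Naive OPS: return the candidate policy with the highest OPE estimate."""
    scores = {name: is_estimate(trajectories, probs, behavior_probs, gamma)
              for name, probs in candidates.items()}
    return max(scores, key=scores.get), scores
```

With noisy per-policy estimates like these, the argmax step tends to favor whichever candidate's estimate happened to be inflated, which is why a selection method robust to that variance is needed in practice.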


Published In

RecSys '24: Proceedings of the 18th ACM Conference on Recommender Systems
October 2024
1438 pages
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Mobile Games
  2. Off-Policy Evaluation
  3. Reinforcement Learning

Qualifiers

  • Extended-abstract
  • Research
  • Refereed limited

Acceptance Rates

Overall Acceptance Rate 254 of 1,295 submissions, 20%
