Invited Talk
DOI: 10.1145/3523227.3547398

Estimating Long-term Effects from Experimental Data

Published: 13 September 2022

Abstract

A/B testing is a powerful tool that lets companies make informed decisions about their services and products. A limitation of A/B tests is that they do not easily extend to measuring post-experiment (long-term) differences. In this talk, we study a different approach inspired by recent advances in off-policy evaluation in reinforcement learning (RL). The basic RL approach assumes that customer behavior follows a stationary Markovian process, and it estimates the average engagement metric once the process reaches its steady state. In realistic scenarios, however, the stationarity assumption is often violated by weekly variation and seasonality effects. To tackle this challenge, we propose a variant that relaxes the stationarity assumption. We empirically test both the stationary and nonstationary approaches on a synthetic dataset and an online-store dataset.
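
To make the stationary approach concrete, below is a minimal sketch of the basic RL idea, assuming customer behavior is summarized by a small number of discrete engagement states: fit a Markov transition matrix for each experiment arm from logged state transitions, compute its steady-state distribution, and report the engagement metric's expectation under that distribution. The function names, state encoding, and per-state metric vector are illustrative assumptions, not the talk's actual estimator.

```python
import numpy as np

def fit_transition_matrix(transitions, n_states):
    """Estimate P[s, s'] from logged (state, next_state) pairs for one arm."""
    counts = np.zeros((n_states, n_states))
    for s, s_next in transitions:
        counts[s, s_next] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    # Give unvisited states a uniform row so the chain stays well defined.
    return np.where(row_sums > 0, counts / np.maximum(row_sums, 1.0), 1.0 / n_states)

def stationary_distribution(P):
    """Steady-state pi with pi @ P = pi: the left eigenvector for eigenvalue 1."""
    eigvals, eigvecs = np.linalg.eig(P.T)
    pi = np.abs(np.real(eigvecs[:, np.argmax(np.real(eigvals))]))
    return pi / pi.sum()

def long_term_metric(transitions, state_metric, n_states):
    """Steady-state expectation of a per-state engagement metric."""
    P = fit_transition_matrix(transitions, n_states)
    return float(stationary_distribution(P) @ np.asarray(state_metric))

# Hypothetical usage: the long-term effect is the steady-state metric gap.
# The (state, next_state) logs below stand in for real A/B-test data.
treatment_logs = [(0, 1), (1, 2), (2, 2), (2, 1), (1, 1)]
control_logs = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0)]
sessions_per_week = [0.0, 1.0, 3.0]  # illustrative engagement value per state
effect = (long_term_metric(treatment_logs, sessions_per_week, 3)
          - long_term_metric(control_logs, sessions_per_week, 3))
print(f"Estimated long-term effect: {effect:+.3f}")
```

One plausible reading of the nonstationary variant, offered here as an assumption rather than the talk's construction, is to fit a separate transition matrix per day of week (or season) and average the engagement metric over the resulting periodic steady state, so that weekly variation no longer biases the long-run estimate.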

Supplementary Material

Presentation video (MP4): Estimating Long-term Effects from Experimental Data.mp4

Cited By

  • Learning Metrics that Maximise Power for Accelerated A/B-Tests. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2024), 5183–5193. DOI: 10.1145/3637528.3671512
  • International Workshop on Deep Learning Practice for High-Dimensional Sparse Data with RecSys 2023. In Proceedings of the 17th ACM Conference on Recommender Systems (2023), 1276–1280. DOI: 10.1145/3604915.3608765

Published In

RecSys '22: Proceedings of the 16th ACM Conference on Recommender Systems
September 2022
743 pages
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. A/B testing
  2. off-policy evaluation
  3. reinforcement learning

Qualifiers

  • Invited-talk
  • Research
  • Refereed limited

Acceptance Rates

Overall acceptance rate: 254 of 1,295 submissions (20%)
