DOI: 10.1145/3580305.3599308

Detecting Interference in Online Controlled Experiments with Increasing Allocation

Published: 04 August 2023

Abstract

Over the past decade, the technology industry has adopted online controlled experiments (a.k.a. A/B testing) to guide business decisions. In practice, A/B tests are often implemented with increasing treatment allocation: the new treatment is gradually released to an increasing number of units through a sequence of randomized experiments. In settings such as social networks or bipartite online marketplaces, interference among units may exist and can harm the validity of simple inference procedures. In this work, we introduce a widely applicable procedure to test for interference in A/B testing with increasing allocation. The procedure can be implemented on top of an existing A/B testing platform with a separate flow and does not require specifying an interference mechanism a priori. Specifically, we introduce two permutation tests that are valid under different assumptions: a general statistical test for interference that requires no additional assumptions, and a testing procedure that is valid under a time fixed effect assumption. The testing procedure has very low computational complexity, is powerful, and formalizes a heuristic algorithm already implemented in industry. We demonstrate the performance of the proposed tests through simulations on synthetic data. Finally, we discuss an application at LinkedIn, where a screening step based on the proposed methods is implemented to detect potential interference in all marketplace experiments.
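
The full paper specifies the two tests precisely; as a rough, hypothetical illustration of the permutation-testing idea only (not the authors' exact procedure), the Python sketch below compares outcomes of units that remain in control across two allocation stages and permutes the stage labels to obtain a p-value. The function name, inputs, and test statistic are illustrative assumptions.

```python
import numpy as np


def permutation_test(outcomes, stage_labels, n_perm=10_000, seed=0):
    """Generic two-group permutation test on the absolute difference in means.

    outcomes     : 1-D array of unit-level metrics, e.g., control-unit outcomes
                   pooled from a low-allocation and a high-allocation stage.
    stage_labels : 1-D boolean array, True for units observed at the later stage.
    Returns a p-value; small values indicate the two stages differ.
    """
    rng = np.random.default_rng(seed)
    outcomes = np.asarray(outcomes, dtype=float)
    stage_labels = np.asarray(stage_labels, dtype=bool)

    def stat(labels):
        return abs(outcomes[labels].mean() - outcomes[~labels].mean())

    observed = stat(stage_labels)
    perm_stats = np.array([stat(rng.permutation(stage_labels)) for _ in range(n_perm)])
    # Adding 1 to numerator and denominator gives a valid, slightly conservative p-value.
    return (1 + np.sum(perm_stats >= observed)) / (n_perm + 1)
```

If there is no interference, increasing the treated fraction should not shift the outcomes of units kept in control, so a small p-value flags potential interference. In practice such shifts can also reflect temporal trends, which is one reason the paper distinguishes a fully general test from one that relies on a time fixed effect assumption.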

Supplementary Material

MP4 File (rtfp1086-2min-promo.mp4)
We develop easy-to-use and computationally efficient statistical tests to detect interference in A/B tests. The resulting tests can be added as a filter in existing experimentation pipelines to send different experiments to different inference platforms.
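
As a purely illustrative sketch of such a filter (the threshold, labels, and downstream platforms are assumptions, not LinkedIn's actual configuration), the routing step could look like the following, where the p-value comes from an interference test such as the sketch above.

```python
def route_experiment(p_value: float, alpha: float = 0.05) -> str:
    """Hypothetical screening step: route an experiment by its interference-test p-value."""
    if p_value < alpha:
        # Evidence of potential interference: use an interference-aware analysis platform.
        return "interference-aware analysis"
    # Otherwise, standard A/B analysis applies.
    return "standard A/B analysis"
```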




Published In

KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2023
5996 pages
ISBN: 9798400701030
DOI: 10.1145/3580305

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. a/b testing
  2. causal inference
  3. hypothesis testing
  4. network effects
  5. network interference

Qualifiers

  • Research-article

Conference

KDD '23

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
