ABSTRACT
A/B testing is a standard approach for evaluating the effect of online experiments; the goal is to estimate the 'average treatment effect' of a new feature or condition by exposing a sample of the overall population to it. A drawback of A/B testing is that it is poorly suited for experiments involving social interference, where the treatment of individuals spills over to neighboring individuals along an underlying social network. In this work, we propose a novel methodology using graph clustering to analyze average treatment effects under social interference. To begin, we characterize graph-theoretic conditions under which individuals can be considered to be 'network exposed' to an experiment. We then show how graph cluster randomization admits an efficient exact algorithm to compute the probability that each vertex is network exposed under several of these exposure conditions. Using these probabilities as inverse weights, a Horvitz-Thompson estimator can then provide an effect estimate that is unbiased, provided that the exposure model has been properly specified.
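As a rough illustration of the last two steps, the sketch below assumes one specific exposure condition (a vertex counts as network exposed to a condition only when it and all of its neighbors receive that condition) and assumes each cluster is assigned to treatment independently with probability p. Under those assumptions the exposure probability depends only on the number of distinct clusters touching the vertex's closed neighborhood. The function names and the toy graph are hypothetical and chosen for this example only.

```python
def exposure_probabilities(adj, cluster_of, p):
    """Return {v: (P[v fully treatment-exposed], P[v fully control-exposed])}.

    Assumes clusters are treated independently with probability p and that
    exposure requires the vertex and all its neighbors to share a condition.
    """
    probs = {}
    for v, nbrs in adj.items():
        # Distinct clusters intersecting the closed neighborhood of v.
        k = len({cluster_of[v]} | {cluster_of[u] for u in nbrs})
        probs[v] = (p ** k, (1 - p) ** k)
    return probs


def horvitz_thompson_ate(adj, z, y, probs):
    """Inverse-probability-weighted effect estimate.

    z[v] is True if v is treated; y[v] is the observed response of v.
    Vertices whose closed neighborhood is mixed contribute nothing.
    """
    total = 0.0
    for v, nbrs in adj.items():
        closed = [v, *nbrs]
        p_treat, p_ctrl = probs[v]
        if all(z[u] for u in closed):
            total += y[v] / p_treat
        elif not any(z[u] for u in closed):
            total -= y[v] / p_ctrl
    return total / len(adj)


# Toy example: a path 0-1-2-3 split into clusters {0, 1} and {2, 3}.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cluster_of = {0: "a", 1: "a", 2: "b", 3: "b"}
probs = exposure_probabilities(adj, cluster_of, p=0.5)
```

Vertices 1 and 2 sit on the cluster boundary, so their exposure probabilities are smaller and their responses receive larger weights; this is the source of the variance that the clustering is meant to control.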
Given an estimator that is unbiased, we focus on minimizing its variance. First, we develop simple sufficient conditions for the variance of the estimator to be asymptotically small in n, the size of the graph. However, for general randomization schemes, this variance can be lower bounded by an exponential function of the degrees of the graph. In contrast, we show that if a graph satisfies a restricted-growth condition on the growth rate of neighborhoods, then there exists a natural clustering algorithm, based on vertex neighborhoods, for which the variance of the estimator can be upper bounded by a linear function of the degrees. Thus we show that proper cluster randomization can lead to exponentially lower estimator variance when experimentally measuring average treatment effects under interference.
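The abstract does not spell out the clustering algorithm, so the following is only one plausible shape for a neighborhood-based clustering, written as an assumption: greedily pick centers that are pairwise more than two hops apart, so that every vertex lies within two hops of some center, then attach each vertex to a nearest center. The function names are hypothetical, and ties between equally close centers are broken by traversal order.

```python
from collections import deque


def ball(adj, source, radius):
    """Return the set of vertices within `radius` hops of `source` (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if dist[v] == radius:
            continue  # do not expand past the requested radius
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return set(dist)


def neighborhood_clustering(adj):
    """Map each vertex to the center of its cluster.

    Step 1: scan vertices in sorted order; an uncovered vertex becomes a
            center and covers everything within two hops of it.
    Step 2: multi-source BFS from all centers labels every vertex with
            a nearest center.
    """
    centers, covered = [], set()
    for v in sorted(adj):
        if v not in covered:
            centers.append(v)
            covered |= ball(adj, v, 2)

    cluster_of = {c: c for c in centers}
    queue = deque(centers)
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in cluster_of:
                cluster_of[u] = cluster_of[v]
                queue.append(u)
    return cluster_of


# Toy example: a path on seven vertices, 0-1-2-3-4-5-6.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
clusters = neighborhood_clustering(path)
```

Because the centers are spread out while every cluster stays local, each vertex's neighborhood can only touch a limited number of clusters when neighborhoods do not grow too quickly, which is the intuition behind tying the variance bound to a restricted-growth condition.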
Index Terms
- Graph cluster randomization: network exposure to multiple universes