Abstract
We study a problem arising in statistical analysis, the minimum bottleneck generalized matching problem, which involves partitioning a population into blocks in order to carry out generalizable statistical analyses of randomized experiments. At a high level, the problem is to find a clustering of the population such that each part has at least a given size and contains at least a given number of elements from each treatment class (so that the experiments are statistically significant), and such that all elements within a block are as similar as possible (to improve the accuracy of the analysis).
More formally, we are given a metric space \((V, d)\), a treatment partition \(\mathcal {T} = \{T_1, \ldots , T_k\}\) of \(V\), and a target cardinality vector \((b_0, b_1, \ldots , b_k) \in \mathbb {Z}_+^{k+1}\) such that \(b_0 \ge \sum _{j=1}^k b_j\). The objective is to find a partition \(M_1, \ldots , M_\ell \) of \(V\) minimizing the maximum diameter of any part, subject to \(|M_i| \ge b_0\) and \(|M_i \cap T_j| \ge b_j\) for every part \(M_i\) and all \(j=1, \ldots , k\).
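To make the formal definition concrete, the following minimal Python sketch checks whether a candidate partition satisfies the cardinality constraints and evaluates its objective value (the maximum diameter over all parts). It is only an illustration of the problem statement, not the approximation algorithm of the paper; the function names `diameter`, `is_feasible`, and `bottleneck`, and the toy instance on the real line, are assumptions made for this example.

```python
from itertools import combinations


def diameter(part, d):
    """Maximum pairwise distance within a part (0 if it has fewer than two elements)."""
    return max((d(u, v) for u, v in combinations(part, 2)), default=0)


def is_feasible(parts, V, treatments, b):
    """Check the constraints of the problem statement.

    parts      -- candidate partition M_1, ..., M_l (list of lists)
    V          -- the ground set
    treatments -- the treatment classes T_1, ..., T_k (list of sets)
    b          -- the vector (b_0, b_1, ..., b_k)
    """
    # The parts must be disjoint and cover V exactly.
    flat = [x for part in parts for x in part]
    if len(flat) != len(set(flat)) or set(flat) != set(V):
        return False
    b0, per_class = b[0], b[1:]
    for part in parts:
        if len(part) < b0:                      # |M_i| >= b_0
            return False
        for T_j, b_j in zip(treatments, per_class):
            if len(set(part) & T_j) < b_j:      # |M_i ∩ T_j| >= b_j
                return False
    return True


def bottleneck(parts, d):
    """Objective value: the largest diameter of any part."""
    return max(diameter(part, d) for part in parts)


# Hypothetical toy instance: six points on a line, two treatment classes.
V = [0, 1, 2, 10, 11, 12]
treatments = [{0, 1, 10, 11}, {2, 12}]
b = (3, 1, 1)                                   # b_0 >= b_1 + b_2 holds


def d(u, v):
    return abs(u - v)


good = [[0, 1, 2], [10, 11, 12]]                # feasible, objective 2
bad = [[0, 1, 10], [2, 11, 12]]                 # first part misses T_2
```

On this instance `good` is feasible with objective value 2, whereas `bad` violates the per-class constraint because its first part contains no element of the second treatment class.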
Our main contribution is to provide a tight 2-approximation for the problem. We also show how to modify the algorithm to get the same approximation ratio for the more general problem of finding a partition where each part spans a given matroid.
Notes
1. Matching here refers to the concept in Statistics. It should not be confused with the traditional concept from Graph Theory.
2. The diameter is defined as the maximum distance between nodes in a set.
3. A partial partition of V is a partition of a subset of V.
Acknowledgement
We would like to thank Jasjeet Sekhon for early discussions on minimum bottleneck generalized matching.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Mestre, J., Moses, N.E.S. (2020). Tight Approximation for the Minimum Bottleneck Generalized Matching Problem. In: Kim, D., Uma, R., Cai, Z., Lee, D. (eds) Computing and Combinatorics. COCOON 2020. Lecture Notes in Computer Science, vol 12273. Springer, Cham. https://doi.org/10.1007/978-3-030-58150-3_26
DOI: https://doi.org/10.1007/978-3-030-58150-3_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58149-7
Online ISBN: 978-3-030-58150-3
eBook Packages: Computer Science, Computer Science (R0)