Abstract
We study the maximum set coverage problem in the massively parallel model. In this setting, m sets, each a subset of a universe of n elements, are distributed among m machines. In each round, the machines can communicate with each other, subject to the constraint that no machine may use more than \(\tilde{O} \left( n \right) \) memory. The objective is to find k sets whose union covers the maximum number of elements. We consider the regime where \(k = \Omega (m)\) (e.g., \(k = m/100\)), \(m = O(n)\), and each machine has \(\tilde{O} \left( n \right) \) memory\(^1\).
Maximum coverage is a special case of submodular maximization subject to a cardinality constraint. This problem can be approximated to within a \(1-1/e\) factor using the greedy algorithm, but this approach is not directly applicable to parallel and distributed models. When \(k = \Omega (m)\), to obtain a \(1-1/e-\epsilon \) approximation, previous work requires either \(\tilde{O} \left( mn \right) \) memory per machine, which offers no advantage over the trivial algorithm that sends the entire input to a single machine, or \(2^{O(1/\epsilon )} n\) memory per machine, which is prohibitively expensive even for a moderately small value of \(\epsilon \).
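The greedy rule mentioned above can be sketched as follows. This is a minimal sequential illustration, assuming the sets fit in one machine's memory and are given as a dict from hypothetical set names to Python sets of element ids. It is the classic centralized greedy, not the parallel algorithm of this paper.

```python
from typing import Dict, List, Set


def greedy_max_coverage(sets: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick up to k set names greedily.

    In each step, take the set that covers the most not-yet-covered
    elements. Ties are broken by name order (an arbitrary choice made
    here so the output is deterministic).
    """
    covered: Set[int] = set()
    chosen: List[str] = []
    remaining = dict(sets)  # shallow copy so the input dict is not modified
    for _ in range(min(k, len(sets))):
        # Marginal gain of a set = number of new elements it would add.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # no remaining set adds anything new
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


if __name__ == "__main__":
    example = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {5, 6}, "D": {1, 6}}
    print(greedy_max_coverage(example, 2))  # ['A', 'C']
```

On this toy instance the first pick is A (four new elements) and the second is C (two new elements), so the two chosen sets cover all six elements. Each step depends on the previous one, which is why the sequential greedy does not transfer directly to a setting with a bounded number of communication rounds.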
Our result is a randomized \((1-1/e-\epsilon )\)-approximation algorithm that uses [...] rounds. Our algorithm combines three ingredients: solving a slightly transformed linear program of the maximum coverage problem using the multiplicative weights update method, classic techniques in parallel computing such as parallel prefix, and various combinatorial arguments.
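For orientation, the textbook linear programming relaxation of maximum coverage can be written as below. The symbols \(S_i\), \(x_i\), and \(y_j\) are notation assumed here for illustration, and the slightly transformed program that the paper solves may differ from this standard form.

\[
\begin{aligned}
\text{maximize} \quad & \sum_{j=1}^{n} y_j \\
\text{subject to} \quad & \sum_{i=1}^{m} x_i \le k, \\
& y_j \le \sum_{i \,:\, j \in S_i} x_i \quad \text{for all } j \in \{1, \dots, n\}, \\
& 0 \le x_i \le 1, \quad 0 \le y_j \le 1 .
\end{aligned}
\]

Here \(x_i\) is the fractional extent to which set \(S_i\) is selected and \(y_j\) is the extent to which element \(j\) is covered. An integral solution corresponds to a choice of at most \(k\) sets together with the elements they cover.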
This work is supported by the National Science Foundation under Grant No. 2342527.
Notes
1. The input size is \(O(mn)\) and each machine has enough memory to store a constant number of sets.
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Bui, T., Vu, H.T. (2025). Massively Parallel Maximum Coverage Revisited. In: Královič, R., Kůrková, V. (eds) SOFSEM 2025: Theory and Practice of Computer Science. SOFSEM 2025. Lecture Notes in Computer Science, vol 15538. Springer, Cham. https://doi.org/10.1007/978-3-031-82670-2_13
DOI: https://doi.org/10.1007/978-3-031-82670-2_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-82669-6
Online ISBN: 978-3-031-82670-2
eBook Packages: Computer Science, Computer Science (R0)