Abstract
We design sublinear-space streaming algorithms for estimating three fundamental graph parameters (the sizes of a maximum independent set, a minimum dominating set, and a maximum matching) on sparse graph classes, i.e., graphs that satisfy \(m=O(n)\), where \(m\) and \(n\) are the numbers of edges and vertices, respectively. Each parameter we consider can have size \(\varOmega (n)\) even on sparse graph classes, so sublinear-space algorithms are restricted to estimating the parameter rather than finding a solution. We obtain these results:
- Estimating Max Independent Set via the Caro-Wei bound: Caro and Wei each showed that \(\lambda = \sum _{v} {1}/(d(v) + 1)\) is a lower bound on the size of a maximum independent set, where \(d(v)\) is the degree of vertex \(v\). If the average degree \(\bar{d}\) is \(\mathcal {O}(1)\) and the maximum degree is \(\varDelta = \mathcal {O}({\varepsilon }^{2} \bar{d}^{-3} n)\), our algorithms, with success probability at least \(1 - \delta \):
  - In online streaming, return an actual independent set of size \(1 \pm {\varepsilon }\) times \(\lambda \). This improves on Halldórsson et al. [Algorithmica ’16]: we use less working space, i.e., \(\mathcal {O}(\log {\varepsilon }^{-1} \cdot \log n \cdot \log \delta ^{-1})\), have faster updates, i.e., \(\mathcal {O}(\log {\varepsilon }^{-1})\), and have bounded success probability.
  - In insertion-only streams, approximate \(\lambda \) within a factor of \(1 \pm {\varepsilon }\), in one pass, in \(\mathcal {O}(\bar{d} {\varepsilon }^{-2} \log n \cdot \log \delta ^{-1})\) space. This aligns with the result of Cormode et al. [ISCO ’18], though our method also works for online streaming. In a vertex-arrival and random-order stream, the space reduces to \(\mathcal {O}(\log (\bar{d} {\varepsilon }^{-1}))\). With extra space and a post-processing step, we remove the max-degree constraint.
- Sublinear-Space Algorithms on Forests: On a forest, Esfandiari et al. [SODA ’15, TALG ’18] showed space lower bounds for one-pass randomized algorithms that approximate these graph parameters. We narrow the gap between upper and lower bounds:
  - Max independent set size within \(3/2 \cdot (1 \pm {\varepsilon })\) in one pass and in \(\log ^{\mathcal {O}(1)} n\) space, and within \(4/3\cdot (1 \pm {\varepsilon })\) in two passes and in \(\tilde{\mathcal {O}}(\sqrt{n})\) space; the lower bound is for approximation ratios \(\le 4/3\).
  - Min dominating set size within \(3 \cdot (1 \pm {\varepsilon })\) in one pass and in \(\log ^{\mathcal {O}(1)} n\) space, and within \(2\cdot (1 \pm {\varepsilon })\) in two passes and in \(\tilde{\mathcal {O}}(\sqrt{n})\) space; the lower bound is for approximation ratios \(\le 3/2\).
  - Max matching size within \(2 \cdot (1 \pm {\varepsilon })\) in one pass and in \(\log ^{\mathcal {O}(1)} n\) space, and within \(3/2\cdot (1 \pm {\varepsilon })\) in two passes and in \(\tilde{\mathcal {O}}(\sqrt{n})\) space; the lower bound is for approximation ratios \(\le 3/2\).
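To make the Caro-Wei quantity \(\lambda \) concrete, the following minimal sketch computes it exactly from an edge list and then draws an independent set using the standard random-ordering argument (keep a vertex if it precedes all of its neighbours). This is an illustration only: it uses linear space and true random ranks, so it is not the sublinear-space method of the paper, and the function names `caro_wei_bound` and `permutation_independent_set` as well as the example path graph are assumptions introduced here. The graph is assumed to be simple (no self-loops).

```python
import random
from collections import defaultdict
from fractions import Fraction


def caro_wei_bound(vertices, edges):
    """Return lambda = sum over v of 1 / (d(v) + 1) as an exact fraction.

    Assumption: the whole degree table fits in memory (linear space).
    """
    degree = defaultdict(int)
    for u, v in edges:  # one pass over the edge list
        degree[u] += 1
        degree[v] += 1
    return sum(Fraction(1, degree[v] + 1) for v in vertices)


def permutation_independent_set(vertices, edges, rng):
    """Keep every vertex whose random rank is lower than all its neighbours'."""
    rank = {v: rng.random() for v in vertices}
    beaten = set()
    for u, v in edges:
        # On each edge, the endpoint with the larger rank cannot be kept
        # (an exact tie, which is vanishingly unlikely, goes against v).
        beaten.add(u if rank[u] > rank[v] else v)
    return [v for v in vertices if v not in beaten]


# Hypothetical input: a path on four vertices, 0 - 1 - 2 - 3.
path_vertices = [0, 1, 2, 3]
path_edges = [(0, 1), (1, 2), (2, 3)]

bound = caro_wei_bound(path_vertices, path_edges)  # 1/2 + 1/3 + 1/3 + 1/2
chosen = permutation_independent_set(path_vertices, path_edges, random.Random(0))
print(bound, chosen)
```

Under uniformly random ranks, a vertex \(v\) comes first among itself and its \(d(v)\) neighbours with probability \(1/(d(v)+1)\), so the expected size of the returned set equals \(\lambda \). The set is always independent because every edge marks at least one of its endpoints as beaten.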
Notes
1. The full version is at https://doi.org/10.48550/arXiv.2305.16815.
2. This includes planar graphs, bounded treewidth, bounded genus, H-minor-free, etc.
3. The relative error between the estimate and the actual value.
4. \(\varOmega (\sqrt{n})\) (or \(\varOmega (n)\)) space is required for randomized (or deterministic) algorithms.
5. \(\tilde{\mathcal {O}}\)-notation suppresses the poly-logarithmic factor in the bound.
6. Due to space constraints, each result labeled \([\star ]\) has its proof in the full version (see footnote 1).
7. Unless otherwise stated, we assume that the input forest has no isolated vertices.
8. The variable c of Table 1 is now out of scope.
References
Araujo, F., Farinha, J., Domingues, P., Silaghi, G.C., Kondo, D.: A maximum independent set approach for collusion detection in voting pools. J. Parallel Distrib. Comput. 71(10), 1356–1366 (2011)
Bauckmann, J., Abedjan, Z., Leser, U., Müller, H., Naumann, F.: Discovering conditional inclusion dependencies. In: CIKM 2012, pp. 2094–2098 (2012)
Boppana, R.B., Halldórsson, M.M., Rawitz, D.: Simple and local independent set approximation. In: SIROCCO 2018, pp. 88–101 (2018)
Bury, M., et al.: Structural results on matching estimation with applications to streaming. Algorithmica 81(1), 367–392 (2019)
Bury, M., Schwiegelshohn, C.: Sublinear estimation of weighted matchings in dynamic data streams. In: Bansal, N., Finocchi, I. (eds.) ESA 2015. LNCS, vol. 9294, pp. 263–274. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48350-3_23
Caro, Y.: New results on the independence number. Technical report, Tel-Aviv University (1979)
Cormode, G., Dark, J., Konrad, C.: Approximating the Caro-Wei bound for independent sets in graph streams. In: Lee, J., Rinaldi, G., Mahjoub, A.R. (eds.) ISCO 2018. LNCS, vol. 10856, pp. 101–114. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-96151-4_9
Cormode, G., Firmani, D.: A unifying framework for \(\ell _0\)-sampling algorithms. Distrib. Parallel Databases 32(3), 315–335 (2014)
Cormode, G., Jowhari, H., Monemizadeh, M., Muthukrishnan, S.: The sparse awakens: streaming algorithms for matching size estimation in sparse graphs. In: ESA 2017, pp. 29:1–29:15 (2017)
Cormode, G., Muthukrishnan, S.: An improved data stream summary: the count-min sketch and its applications. J. Algorithms 55(1), 58–75 (2005)
Das Sarma, A., et al.: Finding related tables. In: SIGMOD 2012, pp. 817–828 (2012)
DeLaVina, E., Larson, C.E., Pepper, R., Waller, B., Favaron, O.: On total domination and support vertices of a tree. AKCE Int. J. Graphs Comb. 7(1), 85–95 (2010)
Deng, D., et al.: The data civilizer system. In: CIDR (2017)
Eidenbenz, S.J.: Online dominating set and variations on restricted graph classes. Technical report 380, ETH Zurich, Department of Computer Science (2002)
Esfandiari, H., Hajiaghayi, M.T., Liaghat, V., Monemizadeh, M., Onak, K.: Streaming algorithms for estimating the matching size in planar graphs and beyond. In: SODA 2015, pp. 1217–1233 (2015)
Feigenbaum, J., Kannan, S., McGregor, A., Suri, S., Zhang, J.: On graph problems in a semi-streaming model. Theor. Comp. Sci. 348(2–3), 207–216 (2005)
Ganti, V., Das Sarma, A.: Data cleaning: a practical perspective. Synth. Lect. Data Manage. 5(3), 1–85 (2013)
Gemsa, A., Nöllenburg, M., Rutter, I.: Evaluation of labeling strategies for rotating maps. J. Exp. Algorithmics (JEA) 21, 1–21 (2016)
Halldórsson, B.V., Halldórsson, M.M., Losievskaja, E., Szegedy, M.: Streaming algorithms for independent sets in sparse hypergraphs. Algorithmica 76(2), 490–501 (2016)
Halldórsson, M.M., Radhakrishnan, J.: Greed is good: approximating independent sets in sparse and bounded-degree graphs. Algorithmica 18(1), 145–163 (1997)
Hossain, A., et al.: Automated design of thousands of nonrepetitive parts for engineering stable genetic systems. Nat. Biotech. 38(12), 1466–1475 (2020)
Indyk, P.: A small approximately min-wise independent family of hash functions. J. Algorithms 38(1), 84–90 (2001)
Jayaram, R., Woodruff, D.P.: Data streams with bounded deletions. In: PODS 2018, pp. 341–354 (2018)
Johnson, D.S.: Approximation algorithms for combinatorial problems. J. Comput. Syst. Sci. 9(3), 256–278 (1974)
Kane, D.M., Nelson, J., Woodruff, D.P.: On the exact space complexity of sketching and streaming small norms. In: SODA 2010, pp. 1161–1178 (2010)
Kane, D.M., Nelson, J., Woodruff, D.P.: An optimal algorithm for the distinct elements problem. In: PODS 2010, pp. 41–52 (2010)
Karp, R.M.: Reducibility among combinatorial problems. In: Miller, R.E., Thatcher, J.W., Bohlinger, J.D. (eds.) Complexity of Computer Computations. The IBM Research Symposia Series, pp. 85–103. Springer, Boston (1972). https://doi.org/10.1007/978-1-4684-2001-2_9
Kieritz, T., Luxen, D., Sanders, P., Vetter, C.: Distributed time-dependent contraction hierarchies. In: SEA 2010, pp. 83–93 (2010)
Lemańska, M.: Lower bound on the domination number of a tree. Discussiones Math. Graph Theory 24(2), 165–169 (2004)
Lovász, L.: On the ratio of optimal integral and fractional covers. Discrete Math. 13(4), 383–390 (1975)
Meierling, D., Volkmann, L.: A lower bound for the distance \(k\)-domination number of trees. Results Math. 47(3–4), 335–339 (2005)
Milenković, T., Memišević, V., Bonato, A., Pržulj, N.: Dominating biological networks. PLOS One 6(8), e23016 (2011)
Nacher, J.C., Akutsu, T.: Dominating scale-free networks with variable scaling exponent: heterogeneous networks are not difficult to control. New J. Phys. 14(7), 073005 (2012)
Panconesi, A., Srinivasan, A.: Randomized distributed edge coloring via an extension of the Chernoff-Hoeffding bounds. SICOMP 26(2), 350–368 (1997)
Pino, T., Choudhury, S., Al-Turjman, F.: Dominating set algorithms for wireless sensor networks survivability. IEEE Access 6, 17527–17532 (2018)
Shen, C., Li, T.: Multi-document summarization via the minimum dominating set. In: COLING 2010, pp. 984–992 (2010)
Turán, P.: On an extremal problem in graph theory. Mat. Fiz. Lapok, 436–452 (1941)
Wang, J., Li, G., Feng, J.: Fast-join: an efficient method for fuzzy token matching based string similarity join. In: ICDE 2011, pp. 458–469 (2011)
Wei, V.: A lower bound on the stability number of a simple graph. Technical report, Bell Laboratories Technical Memorandum (1981)
Yu, J., Wang, N., Wang, G., Yu, D.: Connected dominating sets in wireless ad hoc and sensor networks: a comprehensive survey. Comput. Commun. 36, 121–134 (2013)
Acknowledgement
We would like to thank Robert Krauthgamer for helpful discussions.
Author note
Xiuge Chen is now with Optiver, Sydney. Patrick Eades is now with The University of Sydney.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, X., Chitnis, R., Eades, P., Wirth, A. (2023). Sublinear-Space Streaming Algorithms for Estimating Graph Parameters on Sparse Graphs. In: Morin, P., Suri, S. (eds) Algorithms and Data Structures. WADS 2023. Lecture Notes in Computer Science, vol 14079. Springer, Cham. https://doi.org/10.1007/978-3-031-38906-1_17
DOI: https://doi.org/10.1007/978-3-031-38906-1_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-38905-4
Online ISBN: 978-3-031-38906-1
eBook Packages: Computer Science (R0)