Abstract
In this work, we explore the application of sketching data structures to problems on graphs that do not fit entirely in memory. These structures represent data compactly at the cost of some probability of failure. We address the implicit representation and dynamic connectivity problems. Our contributions include two new probabilistic implicit representations: one uses Bloom filters and represents sparse graphs with \(O(|E|)\) bits, and the other uses MinHash sketches and represents trees with \(O(|V|)\) bits. We also describe a variant of an \(\ell_0\)-sampling sketch for which a tighter upper bound on the sampling failure probability can be proved.
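To make the Bloom filter idea concrete, the following is a minimal sketch of an approximate edge set that answers adjacency queries using a constant number of bits per edge. It is an illustration under stated assumptions, not the construction from the chapter: the class name `BloomEdgeSet`, the `bits_per_edge` parameter, and the use of BLAKE2b as the hash family are all hypothetical choices. A query for an inserted edge always succeeds, while a query for a non-edge may return a false positive with small probability.

```python
import hashlib
import math


class BloomEdgeSet:
    """Approximate undirected edge set: no false negatives, some false positives.

    Hypothetical illustration; not the representation defined in the chapter.
    """

    def __init__(self, expected_edges, bits_per_edge=10):
        # O(|E|) bits in total: a constant number of bits per expected edge.
        self.m = max(1, expected_edges * bits_per_edge)
        # Usual choice for the number of hash functions: k = (m / n) * ln 2.
        self.k = max(1, round(bits_per_edge * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, u, v):
        # Order the endpoints so that (u, v) and (v, u) map to the same bits.
        a, b = (u, v) if u <= v else (v, u)
        for i in range(self.k):
            key = f"{i}:{a}:{b}".encode()
            digest = hashlib.blake2b(key, digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.m

    def add_edge(self, u, v):
        for p in self._positions(u, v):
            self.bits[p // 8] |= 1 << (p % 8)

    def maybe_adjacent(self, u, v):
        # True for every inserted edge; possibly True for a non-edge.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(u, v))


edges = [(1, 2), (2, 3), (3, 4)]
graph = BloomEdgeSet(expected_edges=len(edges))
for u, v in edges:
    graph.add_edge(u, v)
print(graph.maybe_adjacent(2, 1))  # inserted edge, so this is always True
```

With 10 bits per edge and the standard choice of hash count, the false positive rate of a Bloom filter is roughly 1%; raising `bits_per_edge` lowers it at a proportional cost in space.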
Acknowledgements
The authors acknowledge partial financial support from CNPq, CAPES, and a FAPERJ BBP grant.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Lopes, J.P.A., Oliveira, F.S., Pinto, P.E.D., Barbosa, V.C. (2019). Sketching Data Structures for Massive Graph Problems. In: Gadepally, V., Mattson, T., Stonebraker, M., Wang, F., Luo, G., Teodoro, G. (eds) Heterogeneous Data Management, Polystores, and Analytics for Healthcare. DMAH Poly 2018. Lecture Notes in Computer Science, vol 11470. Springer, Cham. https://doi.org/10.1007/978-3-030-14177-6_5
DOI: https://doi.org/10.1007/978-3-030-14177-6_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14176-9
Online ISBN: 978-3-030-14177-6
eBook Packages: Computer Science, Computer Science (R0)