Abstract
We consider the problem of computing local clusters in large graphs distributed across nodes in a network, using two different models of distributed computation. In the CONGEST model, we give a distributed algorithm that computes a local cluster in time that depends only logarithmically on the size of the graph. In particular, when the conductance of the optimal local cluster is known, the algorithm runs in time entirely independent of the size of the graph and depends only on the error bounds for approximation. We also show that the local cluster problem can be solved in the k-machine distributed model in sublinear time. The speedup of our local cluster algorithms is mainly due to our distributed algorithm for computing heat kernel pagerank.
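The abstract does not define heat kernel pagerank or show how a cluster is read off from it. As a rough illustration only, the sketch below uses the standard series definition from Chung's work, rho = e^{-t} * sum_k (t^k / k!) * s P^k with P the random walk matrix, followed by a sweep over vertices ordered by rho(v)/d(v). It runs centrally on one machine and is not the CONGEST or k-machine algorithm of the paper. The function names, the adjacency-list input, the truncation length `num_terms`, and the toy graph are all assumptions made for the example.

```python
import math


def heat_kernel_pagerank(adj, seed, t, num_terms=30):
    """Truncated series e^{-t} * sum_k (t^k / k!) * s P^k with P = D^{-1} A.

    adj is a list of neighbor lists for a simple undirected graph with no
    isolated vertices (an assumption of this sketch).
    """
    n = len(adj)
    walk = [0.0] * n          # holds s P^k, starting from the seed indicator
    walk[seed] = 1.0
    rho = [0.0] * n
    weight = math.exp(-t)     # e^{-t} * t^k / k! for k = 0
    for k in range(num_terms):
        for v in range(n):
            rho[v] += weight * walk[v]
        # Advance the distribution by one random-walk step.
        nxt = [0.0] * n
        for u in range(n):
            if walk[u]:
                share = walk[u] / len(adj[u])
                for v in adj[u]:
                    nxt[v] += share
        walk = nxt
        weight *= t / (k + 1)
    return rho


def sweep_cut(adj, rho):
    """Return the prefix of the rho(v)/d(v) ordering with least conductance."""
    n = len(adj)
    total_vol = sum(len(nb) for nb in adj)
    order = sorted(range(n), key=lambda v: rho[v] / len(adj[v]), reverse=True)
    in_set = [False] * n
    vol = cut = 0
    best_phi, best_size = float("inf"), 0
    for i, v in enumerate(order[:-1]):   # skip the full vertex set
        in_set[v] = True
        vol += len(adj[v])
        inside = sum(1 for w in adj[v] if in_set[w])
        # Edges to vertices already inside stop being cut edges;
        # edges to vertices outside become new cut edges.
        cut += len(adj[v]) - 2 * inside
        phi = cut / min(vol, total_vol - vol)
        if phi < best_phi:
            best_phi, best_size = phi, i + 1
    return set(order[:best_size]), best_phi


# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
graph = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
rho = heat_kernel_pagerank(graph, seed=0, t=3.0)
cluster, phi = sweep_cut(graph, rho)
print(sorted(cluster), phi)
```

On this toy graph the sweep recovers the seed's triangle, whose single outgoing edge against volume 7 gives conductance 1/7. The parameter t controls how far the mass diffuses from the seed; the value 3.0 here is arbitrary.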
Acknowledgements
The authors would like to warmly thank Yiannis Koutis for discussion and for suggesting the problem of finding efficient distributed algorithms, as well as the anonymous reviewers for their suggestions for improving the paper.
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Chung, F., Simpson, O. (2015). Distributed Algorithms for Finding Local Clusters Using Heat Kernel Pagerank. In: Gleich, D., Komjáthy, J., Litvak, N. (eds) Algorithms and Models for the Web Graph. WAW 2015. Lecture Notes in Computer Science(), vol 9479. Springer, Cham. https://doi.org/10.1007/978-3-319-26784-5_14
DOI: https://doi.org/10.1007/978-3-319-26784-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26783-8
Online ISBN: 978-3-319-26784-5
eBook Packages: Computer Science, Computer Science (R0)