Abstract
Over the last decade, PageRank has gained importance in a wide range of applications and domains, ever since it first proved to be effective in determining node importance in large graphs (and was a pioneering idea behind Google’s search engine). In distributed computing alone, PageRank vectors, or more generally random-walk-based quantities, have been used for several different applications, including determining important nodes, load balancing, search, and identifying connectivity structures. Surprisingly, however, there has been little work towards designing provably efficient fully distributed algorithms for computing PageRank. The difficulty is that traditional iterative methods in the style of matrix-vector multiplication may not always adapt well to the distributed setting, owing to communication bandwidth restrictions and convergence rates.
In this paper, we present fast random walk-based distributed algorithms for computing PageRank in general graphs and prove strong bounds on the round complexity. We first present an algorithm that takes \(O(\log n/\epsilon)\) rounds with high probability on any graph (directed or undirected), where n is the network size and ε is the reset probability used in the PageRank computation (typically ε is a fixed constant). We then present a faster algorithm that takes \(O(\sqrt{\log n}/{\epsilon})\) rounds in undirected graphs. Both of the above algorithms are scalable, as each node processes and sends only a small (polylogarithmic in n, the network size) number of bits per round, and hence they work in the CONGEST distributed computing model. For directed graphs, we present an algorithm that has a running time of \(O(\sqrt{\log n/{\epsilon}})\) rounds, but it requires a polynomial number of bits to be processed and sent per node in a round. To the best of our knowledge, these are the first fully distributed algorithms for computing PageRank vectors with provably efficient running time.
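To make the random-walk view of PageRank concrete, the following is a minimal centralized sketch, not the distributed algorithms of the paper. It assumes the standard Monte Carlo estimator: every node starts a fixed number of walks, each walk stops with probability ε at every step, and a node's PageRank is estimated as ε times its visit count divided by the total number of walks. The function name `monte_carlo_pagerank`, the parameter `walks_per_node`, and the choice to stop a walk at a node with no out-edges are illustrative assumptions.

```python
import random


def monte_carlo_pagerank(out_neighbors, eps=0.15, walks_per_node=200, seed=0):
    """Estimate PageRank by simulating random walks with reset probability eps.

    out_neighbors maps each node to a list of its out-neighbors.
    This is a single-machine simulation (an assumption for illustration);
    it does not model rounds or message sizes in the CONGEST model.
    """
    rng = random.Random(seed)
    nodes = list(out_neighbors)
    visits = {v: 0 for v in nodes}

    for start in nodes:
        for _ in range(walks_per_node):
            current = start
            while True:
                visits[current] += 1
                # Stop with probability eps (the reset), or when the node has
                # no out-edges (assumed handling of dangling nodes).
                if rng.random() < eps or not out_neighbors[current]:
                    break
                current = rng.choice(out_neighbors[current])

    total_walks = len(nodes) * walks_per_node
    # Each walk makes about 1/eps visits in expectation, so scaling the
    # visit counts by eps / total_walks gives estimates that sum to about 1.
    return {v: eps * visits[v] / total_walks for v in nodes}


if __name__ == "__main__":
    # Hypothetical toy graph: a directed cycle on four nodes.
    cycle = {0: [1], 1: [2], 2: [3], 3: [0]}
    print(monte_carlo_pagerank(cycle))
```

In this sketch a walk survives t steps with probability at most \((1-\epsilon)^t \le e^{-\epsilon t}\), so every walk ends within \(O(\log n/\epsilon)\) steps with high probability. That matches the form of the first bound above, though the abstract itself does not spell out the argument.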
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Das Sarma, A., Molla, A.R., Pandurangan, G., Upfal, E. (2013). Fast Distributed PageRank Computation. In: Frey, D., Raynal, M., Sarkar, S., Shyamasundar, R.K., Sinha, P. (eds) Distributed Computing and Networking. ICDCN 2013. Lecture Notes in Computer Science, vol 7730. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35668-1_2
DOI: https://doi.org/10.1007/978-3-642-35668-1_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35667-4
Online ISBN: 978-3-642-35668-1
eBook Packages: Computer Science (R0)