Abstract
PageRank is inherently massively parallelizable and distributable, as a result of the web's strict host-based link locality. We show that the Gauß-Seidel iterative method can be applied in such a parallel ranking scenario to improve convergence. By introducing a two-dimensional web model and adapting PageRank to this environment, we present efficient methods that compute the exact rank vector, even for large-scale web graphs, in only a few minutes and a few iteration steps. These methods intrinsically support incremental web crawling and require neither page sorting/reordering nor the sharing of global rank information.
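To make the Gauß-Seidel idea concrete, the following is a minimal single-machine sketch in Python. It is not the paper's two-dimensional parallel method, only an illustration of the update rule: each page's new rank is written back immediately, so pages visited later in the same sweep already see the updated values. The function name `pagerank_gauss_seidel`, the damping factor 0.85, the tolerance, and the toy graph are all assumptions made for this example. The sketch also assumes that every page has at least one out-link and that no page links to itself.

```python
def pagerank_gauss_seidel(out_links, damping=0.85, tol=1e-10, max_iter=1000):
    """Iterate x_p = (1 - d)/n + d * sum(x_q / outdeg(q)) over in-links q of p.

    out_links maps each page to a non-empty list of pages it links to.
    Assumptions of this sketch: no dangling pages and no self-links.
    Returns the rank dictionary and the number of sweeps performed.
    """
    pages = sorted(out_links)
    n = len(pages)

    # Invert the adjacency lists so each page knows who links to it.
    in_links = {p: [] for p in pages}
    for src, targets in out_links.items():
        for dst in targets:
            in_links[dst].append(src)

    rank = {p: 1.0 / n for p in pages}
    for sweep in range(1, max_iter + 1):
        change = 0.0
        for p in pages:
            incoming = sum(rank[q] / len(out_links[q]) for q in in_links[p])
            new_value = (1.0 - damping) / n + damping * incoming
            change += abs(new_value - rank[p])
            # Gauss-Seidel step: overwrite in place, so later pages in this
            # sweep use the fresh value (Jacobi would use a separate vector).
            rank[p] = new_value
        if change < tol:
            break
    return rank, sweep


# Hypothetical three-page graph used only for illustration.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks, sweeps = pagerank_gauss_seidel(toy_graph)
```

On this toy graph page `c` receives links from both other pages and ends up with the highest rank. Because the graph has no dangling pages, the ranks sum to 1 without a separate normalization step.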
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kohlschütter, C., Chirita, PA., Nejdl, W. (2006). Efficient Parallel Computation of PageRank. In: Lalmas, M., MacFarlane, A., Rüger, S., Tombros, A., Tsikrika, T., Yavlinsky, A. (eds) Advances in Information Retrieval. ECIR 2006. Lecture Notes in Computer Science, vol 3936. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11735106_22
DOI: https://doi.org/10.1007/11735106_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33347-0
Online ISBN: 978-3-540-33348-7
eBook Packages: Computer Science, Computer Science (R0)