Abstract
Recent work on modeling the Web graph has dwelt on capturing the degree distributions observed on the Web. Pointing out that this represents a heavy reliance on “local” properties of the Web graph, we study the distribution of PageRank values (used in the Google search engine) on the Web. This distribution is of independent interest in optimizing search indices and storage. We show that PageRank values on the Web follow a power law. We then develop detailed models for the Web graph that explain this observation, and moreover remain faithful to previously studied degree distributions. We analyze these models, and compare the analyses to both snapshots from the Web and to graphs generated by simulations on the new models. To our knowledge this represents the first modeling of the Web that goes beyond fitting degree distributions on the Web.
Supported in part by NSF grant CCR-9731477 and NSF ITR grant CCR-0121154.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
L. Adamic and B. Huberman. Power Law distribution of the World Wide Web, Technical Comment on [3], Science, 287, 2000, 2115a.
Arvind Arasu, Junghoo Cho, Hector Garcia-Molina, Andreas Paepcke, Sriram Raghavan. Searching the Web. ACM Transactions on Internet Technology, 1(1), 2001, 2–43.
A. Barabasi and R. Albert. Emergence of Scaling in Random Networks. Science, 286(509), 1999.
A. Barabasi, R. Albert and H. Jeong. Mean-field theory for scale-free random graphs. Physica A, 272, 1999, 173–187.
B. Bollobas. Random Graphs. Academic Press, 1990.
B. Bollobas, O. Riordan, J. Spencer, and G. Tusnady. The degree sequence of a scale-free random graph process. Random Structures and Algorithms, 18(3), 2001, 279–290.
B. Bollobas and O. Riordan. The diameter of a scale-free random graph. preprint, 2001.
S. Brin and L. Page. The anatomy of a large-scale hypertexual Web search engine. In Proceedings of the 7th WWW conference, 1998.
A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, Andrew Tomkins, J. Weiner. Graph Structure in the Web. In Proceedings of the 9th WWW Conference, 2000.
S. Dill, R. Kumar, K. McCurley, S. Rajagopalan, D. Sivakumar, and A. Tomkins. Self-Similarity in the Web. In Proceedings of the 27th International Conference on Very Large Databases (VLDB), 2001.
D. Gibson, J.M. Kleinberg and P. Raghavan. Inferring Web communities from link topology. In Proceedings of the ACM Symposium on Hypertext and Hypermedia, 1998.
Google Inc. http://www.google.com
J. Kleinberg, S. Ravi Kumar, P. Raghavan, S. Rajagopalan and A. Tomkins. The Web as a graph: measurements, models and methods. In Proceedings of the 5th Annual International Computing and Combinatorics Conference (COCOON), 1999.
R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. Trawling the Web for Emerging Cyber-Communities. In Proceedings of the 8th WWW Conference, 1999, 403–416.
R. Kumar, P. Raghavan, S. Rajagopalan, D. Sivakumar, A. Tomkins, and E. Upfal. Stochastic Models for the Web. In Proceedings of the 41st Annual Symposium on the Foundations of Computer Science (FOCS), 2000.
R. Motwani and P. Raghavan. Randomized Algorithms, Cambridge University Press, 1995.
L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing order to the Web, Technical Report, Computer Science Department, Stanford University, 1998.
WT10g collection draft paper. http://www.ted.cmis.csiro.au/TRECWeb/wt10ginfo.ps.gz
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pandurangan, G., Raghavan, P., Upfal, E. (2002). Using PageRank to Characterize Web Structure. In: Ibarra, O.H., Zhang, L. (eds) Computing and Combinatorics. COCOON 2002. Lecture Notes in Computer Science, vol 2387. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45655-4_36
Download citation
DOI: https://doi.org/10.1007/3-540-45655-4_36
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43996-7
Online ISBN: 978-3-540-45655-1
eBook Packages: Springer Book Archive