Abstract
We provide evidence that the inherent hierarchical structure of the web is closely related to its link structure. Moreover, we show that this relationship explains several important features of the web, including the locality and bidirectionality of hyperlinks and the compressibility of the web graph. We describe how to construct data models of the web that capture both the hierarchical nature of the web and some crucial features of the link graph.
An extended version is available at http://www.mccurley.org/papers/entropy.pdf
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Eiron, N., McCurley, K.S. (2004). Links in Hierarchical Information Networks. In: Leonardi, S. (ed.) Algorithms and Models for the Web-Graph. WAW 2004. Lecture Notes in Computer Science, vol 3243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30216-2_12
DOI: https://doi.org/10.1007/978-3-540-30216-2_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23427-2
Online ISBN: 978-3-540-30216-2
eBook Packages: Springer Book Archive