Abstract
The subject of this survey is the directed graph induced by the hyperlinks between Web pages; we refer to this as the Web graph. Nodes represent static HTML pages, and directed edges represent the hyperlinks between them. Recent estimates [5] suggest that there are several hundred million nodes in the Web graph, a number that is growing by several percent each month. The average node has roughly seven hyperlinks (directed edges) to other pages, for a total of several billion hyperlinks.
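As a rough illustration of the definition above, the following minimal Python sketch stores a Web graph as a mapping from each page to the pages it links to, then computes the node count, edge count, and average out-degree. The page names, the helper names `web_graph_stats` and `estimate_edges`, and the figure of 300 million nodes are all assumptions chosen for illustration; they are not taken from the survey.

```python
# Hypothetical toy data: each page maps to the pages it links to.
toy_links = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
    "d.html": [],
}


def web_graph_stats(out_links):
    """Return basic statistics of a directed graph given as an adjacency mapping."""
    # A page is a node if it appears as a link source or as a link target.
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)

    # Count each distinct (source, target) hyperlink once.
    in_degree = {node: 0 for node in nodes}
    num_edges = 0
    for targets in out_links.values():
        for target in set(targets):
            in_degree[target] += 1
            num_edges += 1

    return {
        "nodes": len(nodes),
        "edges": num_edges,
        "avg_out_degree": num_edges / len(nodes) if nodes else 0.0,
        "in_degree": in_degree,
    }


def estimate_edges(num_nodes, avg_out_degree):
    """Total directed edges implied by a node count and an average out-degree."""
    return num_nodes * avg_out_degree


stats = web_graph_stats(toy_links)
print(stats["nodes"], stats["edges"], stats["avg_out_degree"])

# Assumed illustrative value: 300 million nodes stands in for
# "several hundred million"; 7 is the average out-degree quoted in the abstract.
print(estimate_edges(300_000_000, 7))
```

With these assumed inputs, the product of node count and average out-degree comes to about two billion edges, which matches the order of magnitude the abstract quotes.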
References
Abiteboul, S., Quass, D., McHugh, J., Widom, J., Wiener, J.: The Lorel Query language for semistructured data. Intl. J. on Digital Libraries 1(1), 68–88 (1997)
Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proc. VLDB (1994)
Aiello, W., Chung, F., Lu, L.: A random graph model for massive graphs. To appear in the Proceedings of the ACM Symposium on Theory of Computing (2000)
Arocena, G.O., Mendelzon, A.O., Mihaila, G.A.: Applications of a Web query language. In: Proc. 6th WWW Conf. (1997)
Bharat, K., Broder, A.: A technique for measuring the relative size and overlap of public Web search engines. In: Proc. 7th WWW Conf. (1998)
Bharat, K., Henzinger, M.R.: Improved algorithms for topic distillation in a hyperlinked environment. In: Proc. ACM SIGIR (1998)
Brin, S., Page, L.: The anatomy of a large-scale hypertextual Web search engine. In: Proc. 7th WWW Conf. (1998). See also http://www.google.com
Broder, A.Z., Kumar, S.R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., Tomkins, A., Wiener, J.: Graph structure in the web: experiments and models (submitted for publication)
Bollobás, B.: Random Graphs. Academic Press, London (1985)
Carrière, J., Kazman, R.: WebQuery: Searching and visualizing the Web through connectivity. In: Proc. 6th WWW Conf. (1997)
Chakrabarti, S., Dom, B., Gibson, D., Kleinberg, J., Raghavan, P., Rajagopalan, S.: Automatic resource compilation by analyzing hyperlink structure and associated text. In: Proc. 7th WWW Conf. (1998)
Chakrabarti, S., Dom, B., Kumar, S.R., Raghavan, P., Rajagopalan, S., Tomkins, A.: Experiments in topic distillation. In: SIGIR workshop on hypertext IR (1998)
Chakrabarti, S., Dom, B., Indyk, P.: Enhanced hypertext classification using hyperlinks. In: Proc. ACM SIGMOD (1998)
Charikar, M., Kumar, S.R., Raghavan, P., Rajagopalan, S., Tomkins, A.: On targeting Markov segments. In: Proc. ACM Symposium on Theory of Computing (1999)
Davis, H.T.: The Analysis of Economic Time Series. Principia Press (1941)
Downey, R., Fellows, M.: Parametrized Computational Feasibility. In: Clote, P., Remmel, J. (eds.) Feasible Mathematics II, Birkhäuser, Basel (1994)
Egghe, L., Rousseau, R.: Introduction to Informetrics. Elsevier, Amsterdam (1990)
Fagin, R., Karlin, A., Kleinberg, J., Raghavan, P., Rajagopalan, S., Rubinfeld, R., Sudan, M., Tomkins, A.: Random walks with “back buttons”. To appear in the Proceedings of the ACM Symposium on Theory of Computing (2000)
Florescu, D., Levy, A., Mendelzon, A.: Database techniques for the World Wide Web: A survey. SIGMOD Record 27(3), 59–74 (1998)
Garfield, E.: Citation analysis as a tool in journal evaluation. Science 178, 471–479 (1972)
Gilbert, N.: A simulation of the structure of academic science. Sociological Research Online 2(2) (1997)
Golub, G., Van Loan, C.F.: Matrix Computations. Johns Hopkins University Press, Baltimore (1989)
Henzinger, M.R., Raghavan, P., Rajagopalan, S.: Computing on data streams. AMS-DIMACS series, special issue on computing on very large datasets (1998)
Kessler, M.M.: Bibliographic coupling between scientific papers. American Documentation 14, 10–25 (1963)
Kleinberg, J.: Authoritative sources in a hyperlinked environment. J. of the ACM (1999) (to appear); Also appears as IBM Research Report RJ 10076(91892) (May 1997)
Kleinberg, J., Ravi Kumar, S., Raghavan, P., Rajagopalan, S., Tomkins, A.: The Web as a graph: measurements, models and methods. In: Asano, T., Imai, H., Lee, D.T., Nakano, S.-i., Tokuyama, T. (eds.) COCOON 1999. LNCS, vol. 1627, p. 1. Springer, Heidelberg (1999)
Konopnicki, D., Shmueli, O.: Information gathering on the World Wide Web: the W3QL query language and the W3QS system. Trans. on Database Systems (1998)
Kumar, S.R., Raghavan, P., Rajagopalan, S., Tomkins, A.: Trawling emerging cyber-communities automatically. In: Proc. 8th WWW Conf. (1999)
Lakshmanan, L.V.S., Sadri, F., Subramanian, I.N.: A declarative approach to querying and restructuring the World Wide Web. In: Post-ICDE Workshop on RIDE (1996)
Larson, R.: Bibliometrics of the World Wide Web: An exploratory analysis of the intellectual structure of cyberspace. Ann. Meeting of the American Soc. Info. Sci. (1996)
Lotka, A.J.: The frequency distribution of scientific productivity. J. of the Washington Acad. of Sci. 16, 317 (1926)
Mendelzon, A., Mihaila, G., Milo, T.: Querying the World Wide Web. J. of Digital Libraries 1(1), 68–88 (1997)
Mendelzon, A., Wood, P.: Finding regular simple paths in graph databases. SIAM J. Comp. 24(6), 1235–1258 (1995)
Spertus, E.: ParaSite: Mining structural information on the Web. In: Proc. 6th WWW Conf. (1997)
Zipf, G.K.: Human behavior and the principle of least effort. Hafner, New York (1949)
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Raghavan, P. (2000). Graph Structure of the Web: A Survey. In: Gonnet, G.H., Viola, A. (eds) LATIN 2000: Theoretical Informatics. LATIN 2000. Lecture Notes in Computer Science, vol 1776. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10719839_13
DOI: https://doi.org/10.1007/10719839_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67306-4
Online ISBN: 978-3-540-46415-0
eBook Packages: Springer Book Archive