Level-Biased Statistics in the Hierarchical Structure of the Web

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3918)

Abstract

In the web search and mining literature, researchers have tended to treat the World Wide Web as a flat network in which every page, and every hyperlink, is treated identically. However, it is common knowledge that the Web has a natural hierarchical structure defined by the URLs of its pages. Exploring this hierarchical structure, we found several level-biased characteristics of the Web. First, the distribution of pages over levels has a spindle shape. Second, the average indegree in each level decreases sharply as the level goes down. Third, although the indegree distributions in deeper levels obey the same power law as the global indegree distribution, the top levels show quite different statistical characteristics. We believe that these findings might be essential to the Web and that, by making use of them, current web search and mining technologies could be improved, providing better services to web users.
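As a rough illustration of the per-level statistics described in the abstract, the following Python sketch counts pages and averages indegree by URL level. The abstract does not define "level" precisely, so this sketch assumes it is the number of non-empty path segments in the URL (site root = 0). The function names `url_level` and `level_statistics` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from urllib.parse import urlparse


def url_level(url):
    """Assumed definition: number of non-empty path segments (site root = 0)."""
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])


def level_statistics(pages, links):
    """Return {level: (page_count, average_indegree)} for the given pages.

    pages: iterable of page URLs.
    links: iterable of (source_url, target_url) hyperlink pairs.
    """
    # Count incoming links for every link target.
    indegree = defaultdict(int)
    for _source, target in links:
        indegree[target] += 1

    # Accumulate page counts and indegree totals per level.
    page_count = defaultdict(int)
    indegree_sum = defaultdict(int)
    for page in pages:
        level = url_level(page)
        page_count[level] += 1
        indegree_sum[level] += indegree.get(page, 0)

    return {
        level: (page_count[level], indegree_sum[level] / page_count[level])
        for level in sorted(page_count)
    }


# Toy data (hypothetical), only to show the call shape.
pages = [
    "http://a.example/",
    "http://a.example/x",
    "http://a.example/z",
    "http://a.example/x/y.html",
]
links = [
    ("http://a.example/x", "http://a.example/"),
    ("http://a.example/z", "http://a.example/"),
    ("http://a.example/x/y.html", "http://a.example/x"),
]
stats = level_statistics(pages, links)
for level, (count, avg_indegree) in stats.items():
    print(level, count, avg_indegree)
```

On a real crawl, plotting the page counts against level would show the shape of the page distribution, and the averages would show how indegree varies with depth. Whether a file name counts as its own level is a design choice that this sketch settles arbitrarily.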

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Feng, G., Liu, TY., Zhang, XD., Ma, WY. (2006). Level-Biased Statistics in the Hierarchical Structure of the Web. In: Ng, WK., Kitsuregawa, M., Li, J., Chang, K. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2006. Lecture Notes in Computer Science, vol 3918. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731139_37

  • DOI: https://doi.org/10.1007/11731139_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33206-0

  • Online ISBN: 978-3-540-33207-7

  • eBook Packages: Computer Science, Computer Science (R0)
