Abstract
Collocation analysis finds semantic associations between concepts in large text corpora. Applying the same procedure to the sets of outgoing links of web pages identifies, to a large extent, semantically related web domains. The resulting semantic clusters show all the properties of small worlds. The algorithm is known to work for large parts of the web, such as the German internet. As a sample application, we present a surf guide for the German web.
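The idea of treating each page's set of outgoing links like a sentence, and web domains like words, can be illustrated with a minimal sketch. The abstract does not state which significance measure or preprocessing is used, so everything below is an assumption made for illustration: the function names (`domain_of`, `poisson_significance`, `related_domains`), the reduction of a URL to its host name, the `min_cooccurrence` threshold, and the Poisson-style score, which is one choice commonly seen in collocation work.

```python
import math
from collections import Counter
from itertools import combinations
from urllib.parse import urlsplit


def domain_of(url):
    """Return the lowercased host of a URL without a leading 'www.' (a simplification)."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def poisson_significance(n_a, n_b, n_ab, n_pages):
    """Score a domain pair by how unlikely n_ab joint occurrences are by chance.

    Uses the leading term of a Poisson tail with expected count
    lam = n_a * n_b / n_pages, normalised by ln(n_pages).
    This scoring choice is an assumption, not taken from the abstract.
    """
    lam = n_a * n_b / n_pages
    return (lam - n_ab * math.log(lam) + math.lgamma(n_ab + 1)) / math.log(n_pages)


def related_domains(pages, min_cooccurrence=2):
    """Rank pairs of domains that are linked to from the same pages.

    pages: mapping of page URL -> iterable of outgoing link URLs.
    Returns a list of (score, domain_a, domain_b), highest score first.
    """
    link_sets = []
    for source, links in pages.items():
        own = domain_of(source)
        # Each page contributes one set of external target domains.
        link_sets.append({domain_of(u) for u in links} - {own, ""})

    n_pages = len(link_sets)
    if n_pages < 2:
        return []  # ln(1) == 0 would make the score undefined

    single = Counter(d for s in link_sets for d in s)
    pair = Counter(p for s in link_sets for p in combinations(sorted(s), 2))

    scored = [
        (poisson_significance(single[a], single[b], k, n_pages), a, b)
        for (a, b), k in pair.items()
        if k >= min_cooccurrence
    ]
    return sorted(scored, reverse=True)
```

Calling `related_domains` on a crawl result yields domain pairs ordered by score; linking each domain to its top-scoring partners gives a graph on which clusters could then be examined. Link sets with many targets produce quadratically many pairs, so a real crawl would likely need pruning that this sketch omits.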
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Heyer, G., Quasthoff, U. (2006). Calculating Communities by Link Analysis of URLs. In: Böhme, T., Larios Rosillo, V.M., Unger, H., Unger, H. (eds) Innovative Internet Community Systems. IICS 2004. Lecture Notes in Computer Science, vol 3473. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11553762_15
DOI: https://doi.org/10.1007/11553762_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28880-0
Online ISBN: 978-3-540-33995-3
eBook Packages: Computer Science (R0)