Abstract
A technique is presented for identifying patterns in the links between large Web spaces, and it is applied to data on the interlinking of university Web sites in fifteen European countries. The technique rests on a procedure for normalising the data so that it can be analysed with standard multivariate statistical techniques and is less susceptible to individual outliers than standard methods. The approach successfully identified clusters of European countries from their universities' interlinking patterns; for example, it differentiated the northern countries from the southern ones.
Cite this article
Musgrove, P.B., Binns, R., Page-Kennedy, T. et al. A method for identifying clusters in sets of interlinking Web spaces. Scientometrics 58, 657–672 (2003). https://doi.org/10.1023/B:SCIE.0000006886.37828.4a
DOI: https://doi.org/10.1023/B:SCIE.0000006886.37828.4a