A hybrid mapping of information science

Scientometrics

Abstract

Previous studies have shown that hybrid clustering methods that incorporate both textual content and bibliometric information can outperform clustering methods that use only one of these components. In this paper we apply a hybrid clustering method based on Fisher’s inverse chi-square method to integrate full text with citations and to provide a mapping of the field of information science. We quantitatively and qualitatively assess the added value of such an integrated analysis, and we investigate whether the clustering outcome is a better representation of the field by comparing it with a text-only clustering and with another hybrid method based on a linear combination of distance matrices. Our data set consists of almost 1000 articles and notes published in the period 2002–2004 in five representative journals. The optimal number of clusters for the field is five, determined by using a combination of distance-based and stability-based methods. Term networks present the cognitive structure of the field and are complemented by the most representative publications. Three large traditional sub-disciplines can be distinguished, namely information retrieval, bibliometrics/scientometrics, and more social aspects, along with two smaller clusters about patent analysis and webometrics.
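The two integration schemes named in the abstract can be illustrated with a minimal sketch. The abstract does not describe how text-based and citation-based distances are turned into p-values, so the code below assumes that step has already been done and shows only the textbook form of Fisher's inverse chi-square combination for two p-values, next to a weighted linear combination of two distances. All function names and the weight `alpha` are hypothetical and are not taken from the paper.

```python
import math


def fisher_statistic(p_values):
    """Fisher's inverse chi-square statistic: -2 * sum(ln p_i)."""
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in the interval (0, 1]")
    return -2.0 * sum(math.log(p) for p in p_values)


def chi2_sf_even_dof(x, dof):
    """Chi-square survival function for an even number of degrees of freedom.

    For dof = 2k: sf(x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("dof must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 0.0
    for j in range(dof // 2):
        total += term
        term *= half / (j + 1)
    return math.exp(-half) * total


def combine_text_and_citation(p_text, p_cite):
    """Combine one text-based and one citation-based p-value (assumed inputs).

    Under independence the statistic follows a chi-square distribution
    with 2 * 2 = 4 degrees of freedom.
    """
    stat = fisher_statistic([p_text, p_cite])
    return chi2_sf_even_dof(stat, dof=4)


def linear_combination(d_text, d_cite, alpha=0.5):
    """Weighted linear combination of two distances; alpha is an assumed weight."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * d_text + (1.0 - alpha) * d_cite
```

For a pair of documents, a small combined p-value from `combine_text_and_citation` would indicate that the pair is unusually close in both data sources, and it could serve as the integrated distance passed to a clustering algorithm. The linear combination instead depends on the chosen weight and on how comparable the two distance scales are.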




Cite this article

Janssens, F., Glänzel, W. & De Moor, B. A hybrid mapping of information science. Scientometrics 75, 607–631 (2008). https://doi.org/10.1007/s11192-007-2002-7


  • DOI: https://doi.org/10.1007/s11192-007-2002-7
