Abstract
The introduction of textual analysis and the use of lexical similarities have already proved to be an important asset in science mapping. Earlier research showed the added value of hybrid document networks over purely link-based ones through the reduction of extreme sparseness. However, it was only after the application of Natural Language Processing and phrase extraction that networks based purely on lexical similarities could be used as input for topic detection in quantitative science studies. This study investigates the contribution of the lexical component in hybrid clustering on a set of articles published in the journal Scientometrics over the four decades since its foundation. Shifting the weight of the lexical component changes the structure of the underlying hybrid network, and these changes can be detected through clustering techniques. We show that these changes do not move documents randomly but instead identify small groups of papers that either lie at the borderline between different topics or combine them. In addition, the analysis substantiates that the lexical component adopts the structure of the network rather than amplifying hidden structures of the link-based network.
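
To make the idea of shifting the weight of the lexical component concrete, the following minimal sketch combines a link-based and a lexical cosine similarity into one hybrid similarity. It is an illustration under stated assumptions, not the method used in the study: the linear weighting, the names `hybrid_similarity` and `lexical_weight`, and the three toy documents are all hypothetical, and the nearest-neighbour lookup merely stands in for a real clustering step.

```python
import numpy as np


def cosine_similarity_matrix(features: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a document-by-feature matrix."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # all-zero rows get similarity 0 instead of NaN
    unit = features / norms
    return unit @ unit.T


def hybrid_similarity(link_sim: np.ndarray, lexical_sim: np.ndarray,
                      lexical_weight: float) -> np.ndarray:
    """Convex combination of the two similarities (an assumed, simplified scheme)."""
    if not 0.0 <= lexical_weight <= 1.0:
        raise ValueError("lexical_weight must lie in [0, 1]")
    return lexical_weight * lexical_sim + (1.0 - lexical_weight) * link_sim


# Invented toy data: documents 0 and 1 anchor two topics. Document 2 shares
# most of its references with document 0 but most of its vocabulary with
# document 1, so it sits at the borderline between the two topics.
references = np.array([[1, 1, 0, 0],
                       [0, 0, 1, 1],
                       [1, 1, 0, 1]], dtype=float)
terms = np.array([[1, 1, 0, 0],
                  [0, 0, 1, 1],
                  [0, 1, 1, 1]], dtype=float)

link_sim = cosine_similarity_matrix(references)
lexical_sim = cosine_similarity_matrix(terms)


def nearest_anchor(lexical_weight: float) -> int:
    """Index (0 or 1) of the anchor document most similar to document 2."""
    hybrid = hybrid_similarity(link_sim, lexical_sim, lexical_weight)
    return int(np.argmax(hybrid[2, :2]))


for weight in (0.25, 0.75):
    print(f"lexical weight {weight}: document 2 is closest to document {nearest_anchor(weight)}")
```

In this toy setting only the borderline document changes its affiliation as the weight moves, while the two anchor documents stay where they are, which mirrors in miniature the kind of non-random movement described above.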

Data sourced from Clarivate Analytics Web of Science Core Collection

Cite this article
Thijs, B., Glänzel, W. The contribution of the lexical component in hybrid clustering, the case of four decades of “Scientometrics”. Scientometrics 115, 21–33 (2018). https://doi.org/10.1007/s11192-018-2659-0
DOI: https://doi.org/10.1007/s11192-018-2659-0