
Closer in time and higher correlation: disclosing the relationship between citation similarity and citation interval

Scientometrics

Abstract

Investigating the relationship between citation similarity and the citation interval offers insights for refining citation recommendation systems and improving citation evaluation models, and it provides a new perspective for understanding citation patterns. In this study, we used the Library and Information Science (LIS) field as an example to examine the correlation between citation similarity and the citation interval. The method comprised data collection, paper title preprocessing, text vectorization based on SimCSE, calculation of citation similarity and the citation interval, and calculation of the index per citing paper. For the LIS domain, the study found the following: (i) there is a significant negative correlation between citation similarity and the citation interval, although the correlation coefficient is low; (ii) the citation intervals of the least relevant series of cited papers are more susceptible to citation similarity than those of the most relevant series of cited papers; and (iii) the citation intervals of the most relevant cited papers are more concentrated within 12 years and more likely to be published within the average citation interval, typically from the newer half of the cited paper list and published later within 5 years of the citation half-life. This study concludes that researchers usually pay more attention to the latest, most cutting-edge, and strongly relevant existing research than to weakly relevant existing research. Continuous attention and timely incorporation of knowledge into the research direction will promote a more rapid and specialized diffusion of knowledge. These findings are influenced by the accelerated dissemination of information via the Internet, heightened academic competition, and the concentration of research endeavors in specialized disciplines. This study not only contributes to the scholarly discussion of citation analysis but also lays the foundation for future exploration and understanding of citation patterns.
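
The pipeline named in the abstract (vectorize titles, compute citation similarity and the citation interval, then correlate the two) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three-dimensional vectors are toy stand-ins for SimCSE title embeddings, citation similarity is taken to be cosine similarity, the citation interval is taken to be the citing year minus the cited year, and the Pearson coefficient is used only because the abstract does not say which correlation measure the authors applied.

```python
import math
from statistics import mean


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def citation_interval(citing_year, cited_year):
    """Assumed definition: years between the citing and the cited paper."""
    return citing_year - cited_year


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy data: one citing paper and three cited papers.
# The vectors are NOT real SimCSE embeddings.
citing = {"year": 2020, "vec": [1.0, 0.0, 0.0]}
cited = [
    {"year": 2019, "vec": [0.9, 0.1, 0.0]},
    {"year": 2015, "vec": [0.6, 0.8, 0.0]},
    {"year": 2005, "vec": [0.0, 1.0, 0.0]},
]

similarities = [cosine_similarity(citing["vec"], c["vec"]) for c in cited]
intervals = [citation_interval(citing["year"], c["year"]) for c in cited]

# A negative r means more similar cited papers tend to be closer in time.
r = pearson(similarities, intervals)
print(similarities, intervals, r)
```

In this toy data the older cited papers were deliberately given less similar vectors, so the coefficient comes out negative; real title embeddings would of course show a much noisier relationship.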

Author information

Corresponding author

Correspondence to Dejun Zheng.

Ethics declarations

Conflict of interest

No conflicts of interest exist in the submission of this manuscript, and the manuscript has been approved for publication by all authors. We declare that the work described here is original research that has not been published previously and is not under consideration for publication elsewhere, in whole or in part. All the authors listed have approved the enclosed manuscript.

About this article

Cite this article

Cheng, W., Zheng, D., Fu, S. et al. Closer in time and higher correlation: disclosing the relationship between citation similarity and citation interval. Scientometrics 129, 4495–4512 (2024). https://doi.org/10.1007/s11192-024-05080-6

  • DOI: https://doi.org/10.1007/s11192-024-05080-6
