Abstract
This study proposes a method for assessing the impact of scientific communications along their citation paths, beyond conventional direct citations. The method uses the contribution of a scientific communication to its nth generation citations as the basis for calculating residual citations. It reconstitutes residual citations that are lost in subsequent citations (the second, third or nth generations) because of the citation practices termed “Obliteration by Incorporation” and the “Palimpsestic Syndrome”. The method is based on the semantic similarity between the citation contexts of a publication and the citation contexts of its nth generation citations in their (n + 1)th generation citations. It was demonstrated on a sample of biomedical publications comprising ten base articles and five generations of their citations. As in the cascading citation system, the residual citations accruing to articles from their generations of citations decreased as the number of generations increased. However, the residual citation weights accrued to publications differed statistically between the proposed residual citation method and the cascading citation system at all generation levels. The method opens a new frontier for assessing the depth of a publication's impact beyond the conventional direct citation level.
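The idea of weighting a citation path by citation-context similarity can be illustrated with a minimal sketch. The abstract does not specify the similarity model or how similarities are combined across generations, so everything below is an assumption made for illustration: a bag-of-words cosine similarity stands in for a sentence-embedding model, the product of consecutive similarities is a hypothetical aggregation rule, and the function names and example contexts are invented.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; a crude stand-in for a sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def residual_weight(path_contexts):
    """Residual credit reaching a base article along one citation path.

    path_contexts[0] is the context in which a first-generation paper cites
    the base article; path_contexts[k] is the context in which the next
    generation's paper cites the paper that wrote path_contexts[k - 1].
    ASSUMPTION (not from the source): the credit is the product of the
    similarities between consecutive contexts, so a direct citation
    (a single context) carries a weight of 1.0.
    """
    weight = 1.0
    for earlier, later in zip(path_contexts, path_contexts[1:]):
        weight *= cosine_similarity(bag_of_words(earlier), bag_of_words(later))
    return weight


# Hypothetical citation contexts along one path of three generations.
contexts = [
    "gene X regulates insulin secretion",
    "insulin secretion is regulated by gene X",
    "deep learning improves image segmentation",
]
second_generation = residual_weight(contexts[:2])
third_generation = residual_weight(contexts)
```

Under this assumed rule, a later-generation paper that cites its predecessor for the same idea passes credit back to the base article, while one that cites it for unrelated content passes none. Because each similarity is at most 1, the weight cannot grow as the path lengthens, which mirrors the decrease across generations reported in the abstract.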




Data availability
The datasets generated and/or analyzed during the current study are available in the Mendeley repository, through https://doi.org/10.17632/6fgjxkv28d.3.
References
An, J., Kim, N., Kan, M.-Y., Chandrasekaran, M. K., & Song, M. (2017). Exploring characteristics of highly cited authors according to citation location and content. Journal of the Association for Information Science and Technology, 68(8), 1975–1988.
Asubiaro, T. V. (2021). Exploiting semantic similarity between citation contexts for direct citation weighting and residual citation [Doctoral Thesis, The University of Western Ontario]. https://ir.lib.uwo.ca/etd/8008/
Asubiaro, T. V., & Ajiferuke, I. (2021). A proposed method for residual citation allocation based on citation contexts similarity. Research Square. https://doi.org/10.21203/rs.3.rs-1041491/v1
Athar, A., & Teufel, S. (2012). Detection of implicit citations for sentiment detection. Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, 18–26.
Boyack, K. W., van Eck, N. J., Colavizza, G., & Waltman, L. (2018). Characterizing in-text citations in scientific articles: A large-scale analysis. Journal of Informetrics, 12(1), 59–73. https://doi.org/10.1016/j.joi.2017.11.005
Chen, Q., Peng, Y., & Lu, Z. (2019). BioSentVec: Creating sentence embeddings for biomedical texts. In The Seventh IEEE International Conference on Healthcare Informatics (p. 5). IEEE: Beijing, China. https://doi.org/10.1109/ICHI.2019.8904728
Cohan, A., Ammar, W., van Zuylen, M., & Cady, F. (2019). Structural scaffolds for citation intent classification in scientific publications. Proceedings of the 2019 Conference of the North, 3586–3596. https://doi.org/10.18653/v1/N19-1361
Dervos, D. A., & Kalkanis, T. (2005). cc-IFF: A cascading citations impact factor framework for the automatic ranking of research publications. IEEE Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, 2005, 668–673. https://doi.org/10.1109/IDAACS.2005.283070
Ding, Y., Liu, X., Guo, C., & Cronin, B. (2013). The distribution of references across texts: Some implications for citation analysis. Journal of Informetrics, 7(3), 583–592. https://doi.org/10.1016/j.joi.2013.03.003
Ding, Y., Zhang, G., Chambers, T., Song, M., Wang, X., & Zhai, C. (2014). Content-based citation analysis: The next generation of citation analysis. Journal of the Association for Information Science and Technology, 65(9), 1820–1833.
Doslu, M., & Bingol, H. O. (2016). Context sensitive article ranking with citation context analysis. Scientometrics, 108(2), 653–671. https://doi.org/10.1007/s11192-016-1982-6
Einstein, A. (1905). Zur Elektrodynamik bewegter Körper. Annalen Der Physik, 322(10), 891–921. https://doi.org/10.1002/andp.19053221004
Fragkiadaki, E., Evangelidis, G., Samaras, N., & Dervos, D. A. (2009). Cascading citations indexing framework algorithm implementation and testing. 2009 13th Panhellenic Conference on Informatics, 70–74. https://doi.org/10.1109/PCI.2009.30
Han, M., Zhang, X., Yuan, X., Jiang, J., Yun, W., & Gao, C. (2021). A survey on the techniques, applications, and performance of short text semantic similarity. Concurrency and Computation: Practice and Experience, 33(5), e5971. https://doi.org/10.1002/cpe.5971
Hassan, S.-U., Akram, A., & Haddawy, P. (2017). Identifying important citations using contextual information from full text. Digital Libraries (JCDL), 2017 ACM/IEEE Joint Conference On, 1–8.
Herlach, G. (1976). Can retrieval of information from citation indexes be simplified? Multiple mention of a reference as a characteristic of the link between cited and citing article. Journal of the American Society for Information Science, 29(6), 308.
Hernández-Alvarez, M., & Gomez, J. M. (2016). Survey about citation context analysis: Tasks, techniques, and resources. Natural Language Engineering, 22(3), 327–349. https://doi.org/10.1017/S1351324915000388
Hirsch, J. E. (2005). An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences of the United States of America, 102(46), 16569–16572. https://doi.org/10.1073/pnas.0507655102
Hu, Z., Chen, C., & Liu, Z. (2013). Where are citations located in the body of scientific articles? A study of the distributions of citation locations. Journal of Informetrics, 7(4), 887–896. https://doi.org/10.1016/j.joi.2013.08.005
Jeong, Y. K., Song, M., & Ding, Y. (2014). Content-based author co-citation analysis. Journal of Informetrics, 8(1), 197–211. https://doi.org/10.1016/j.joi.2013.12.001
Li, X., He, Y., Meyers, A., & Grishman, R. (2013). Towards fine-grained citation function classification. Proceedings of Recent Advances in Natural Language Processing, 402–407.
Lithgow-Serrano, O., Gama-Castro, S., Ishida-Gutiérrez, C., Mejía-Almonte, C., Tierrafría, V. H., Martínez-Luna, S., Santos-Zavaleta, A., Velázquez-Ramírez, D., & Collado-Vides, J. (2019). Similarity corpus on microbial transcriptional regulation. Journal of Biomedical Semantics, 10(1), 8. https://doi.org/10.1186/s13326-019-0200-x
Maričić, S., Spaventi, J., Pavičić, L., & Pifat-Mrzljak, G. (1998). Citation context versus the frequency counts of citation histories. Journal of the American Society for Information Science, 49(6), 530–540. https://doi.org/10.1002/(SICI)1097-4571(19980501)49:6%3c530::AID-ASI5%3e3.0.CO;2-U
McCain, K. W. (2014). Assessing obliteration by incorporation in a full-text database: JSTOR, economics, and the concept of “bounded rationality.” Scientometrics, 101(2), 1445–1459. https://doi.org/10.1007/s11192-014-1237-3
McKeown, K., Daume, H., Chaturvedi, S., Paparrizos, J., Thadani, K., Barrio, P., Biran, O., Bothe, S., Collins, M., Fleischmann, K. R., Gravano, L., Jha, R., King, B., McInerney, K., Moon, T., Neelakantan, A., O’Seaghdha, D., Radev, D., Templeton, C., & Teufel, S. (2016). Predicting the impact of scientific concepts using full-text features. Journal of the Association for Information Science and Technology, 67(11), 2684–2696. https://doi.org/10.1002/asi.23612
Meng, R., Lu, W., Chui, Y., & Shuguang, H. (2017). Automatic classification of citation function by new linguistic features. IConference 2017 Proceedings, 826–830. https://doi.org/10.9776/17349
Merton, R. K. (1965). On the shoulders of giants: A Shandean postscript. The Free Press.
Merton, R. K. (1988). The Matthew effect in science, II: Cumulative advantage and the symbolism of intellectual property. Isis, 79(4), 606–623. https://doi.org/10.1086/354848
Pride, D., & Knoth, P. (2017). Incidental or influential?—A decade of using text-mining for citation function classification. 16th International Society of Scientometrics and Informetrics Conference, Wuhan, China.
Ritchie, A., Robertson, S., & Teufel, S. (2008). Comparing citation contexts for information retrieval. Proceedings of the 17th ACM Conference on Information and Knowledge Management - CIKM ’08, 213. https://doi.org/10.1145/1458082.1458113
Singha Roy, S., Mercer, R. E., & Urra, F. (2020). Investigating citation linkage as a sentence similarity measurement task using deep learning. In C. Goutte & X. Zhu (Eds.), Advances in artificial intelligence (pp. 483–495). Springer International Publishing. https://doi.org/10.1007/978-3-030-47358-7_50
Soğancıoğlu, G., Öztürk, H., & Özgür, A. (2017). BIOSSES: A semantic sentence similarity estimation system for the biomedical domain. Bioinformatics, 33(14), i49–i58. https://doi.org/10.1093/bioinformatics/btx238
Stremersch, S., Camacho, N., Vanneste, S., & Verniers, I. (2015). Unraveling scientific impact: Citation types in marketing journals. International Journal of Research in Marketing, 32(1), 64–77. https://doi.org/10.1016/j.ijresmar.2014.09.004
Strotmann, A., & Zhao, D. (2014). Uncertainty of author citation rankings: Lessons from in-text citation weighing schemes. Proceedings of the Association for Information Science and Technology, 51(1), 1–4.
Teufel, S., Siddharthan, A., & Tidhar, D. (2006). Automatic classification of citation function. Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing - EMNLP ’06, 103. https://doi.org/10.3115/1610075.1610091
Tuarob, S., Kang, S. W., Wettayakorn, P., Pornprasit, C., Sachati, T., Hassan, S.-U., & Haddawy, P. (2020). Automatic classification of algorithm citation functions in scientific literature. IEEE Transactions on Knowledge and Data Engineering, 32(10), 1881–1896. https://doi.org/10.1109/TKDE.2019.2913376
Valenzuela, M., Ha, V., & Etzioni, O. (2015). Identifying meaningful citations. AAAI Workshops, 21–26.
Voos, H., & Dagaev, K. (1976). Are all citations equal? Or, did we op. cit. your idem? The Journal of Academic Librarianship, 1(6), 19–21.
Wan, X., & Liu, F. (2014). Are all literature citations equally important? Automatic citation strength estimation and its applications. Journal of the Association for Information Science and Technology, 65(9), 1929–1938. https://doi.org/10.1002/asi.23083
Wang, Y., Afzal, N., Fu, S., Wang, L., Shen, F., Rastegar-Mojarad, M., & Liu, H. (2018). MedSTS: A resource for clinical semantic textual similarity. Language Resources and Evaluation. https://doi.org/10.1007/s10579-018-9431-1
Yang, X., He, X., Zhang, H., Ma, Y., Bian, J., & Wu, Y. (2020). Measurement of semantic textual similarity in clinical texts: Comparison of transformer-based models. JMIR Medical Informatics, 8(11), e19735. https://doi.org/10.2196/19735
Zhao, D., & Strotmann, A. (2014). In-text author citation analysis: Feasibility, benefits, and limitations. Journal of the Association for Information Science and Technology, 65(11), 2348–2358. https://doi.org/10.1002/asi.23107
Zhao, D., & Strotmann, A. (2016). Dimensions and uncertainties of author citation rankings: Lessons learned from frequency-weighted in-text citation counting. Journal of the Association for Information Science and Technology, 67(3), 671–682. https://doi.org/10.1002/asi.23418
Zhu, X., Turney, P., Lemire, D., & Vellino, A. (2015). Measuring academic influence: Not all citations are equal. Journal of the Association for Information Science and Technology, 66(2), 408–427. https://doi.org/10.1002/asi.23179
Acknowledgements
This study is a modified version of a pre-print (Asubiaro & Ajiferuke, 2021) that was deposited in Research Square. This journal publication is part of the first author’s doctoral thesis and contains text copied verbatim from the thesis. The contributions of the first author's doctoral thesis committee members, Professor Robert Mercer (Computer Science Department, University of Western Ontario, London, Canada) and Professor Victoria Rubin (Library and Information Science Program, University of Western Ontario, London, Canada), are acknowledged.
Funding
The first author received the Western Graduate Research Scholarships from September 2016 to September 2020, and the Ontario Graduate Scholarships and the Queen Elizabeth II Graduate Scholarships in Science and Technology (OGS/QEII-GSST) from the 2019 summer term to the 2020 winter term.
Ethics declarations
Conflict of interest
The authors report no potential competing interests.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Asubiaro, T.V., Ajiferuke, I. Semantic similarity-based credit attribution on citation paths: a method for allocating residual citation to and investigating depth of influence of scientific communications. Scientometrics 127, 6257–6277 (2022). https://doi.org/10.1007/s11192-022-04522-3
DOI: https://doi.org/10.1007/s11192-022-04522-3