Evaluating scientific impact of publications: combining citation polarity and purpose

Scientometrics

Abstract

Citation counts are commonly used to evaluate the scientific impact of a publication on the general premise that more citations probably mean more endorsements. However, two questionable assumptions underpin this idea: (a) that all citations contribute equally to the paper; and (b) that the endorsement is positive. Clearly, neither of these assumptions holds true. Hence, in this study, we examine two components of citations: their purpose, i.e., the reason for the citation, and their polarity, i.e., the citing author’s attitude toward the cited work. Our findings provide a new perspective on the scientific impact of highly cited publications. Our methodology consists of three steps. First, a pre-trained model that combines Word2Vec, a well-known word embedding approach, with a convolutional neural network (CNN) is used to identify citation polarity and purpose. Second, across a set of highly cited papers, we compare eight categories of purpose, ranging from foundational to critical, and three categories of polarity: positive, negative, and neutral. We further explore how the type of paper (one discussing a discovery or one discussing a utilitarian topic) influences the evaluation of its scientific impact. Finally, we mine and identify the knowledge (e.g., method, concept, tool, or data) that explains the actual scientific impact of a highly cited paper. To demonstrate how combining citation polarity with purpose can provide far greater detail about a paper’s scientific impact, we undertake a case study of 370 highly cited journal articles spanning “Biochemistry & Molecular Biology” and “Genetics & Heredity”. The results yield valuable insights into the assumptions behind using citation counts as a metric for evaluating scientific impact.
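
As a rough illustration of the comparison step described in the abstract, the sketch below tallies already-classified citations by purpose and polarity, so that a paper is described by a profile rather than a single count. It is not the authors' implementation: the `Citation` type, the `impact_profile` function, and the purpose label "method use" are hypothetical names chosen for this example, and the classification itself (Word2Vec plus CNN in the paper) is assumed to have happened upstream.

```python
from collections import Counter
from typing import Iterable, NamedTuple

# The three polarity categories named in the abstract.
POLARITIES = ("positive", "neutral", "negative")


class Citation(NamedTuple):
    """One classified citation context (hypothetical structure)."""
    purpose: str   # e.g. "foundational" or "critical"; other labels are assumptions
    polarity: str  # must be one of POLARITIES


def impact_profile(citations: Iterable[Citation]) -> dict:
    """Summarise citations by polarity share and by (purpose, polarity) pair."""
    items = list(citations)
    for c in items:
        if c.polarity not in POLARITIES:
            raise ValueError(f"unknown polarity: {c.polarity!r}")

    total = len(items)
    by_polarity = Counter(c.polarity for c in items)
    by_pair = Counter((c.purpose, c.polarity) for c in items)

    return {
        "total": total,
        # Share of each polarity; 0.0 for every category when there are no citations.
        "polarity_share": {
            p: (by_polarity[p] / total if total else 0.0) for p in POLARITIES
        },
        "by_pair": dict(by_pair),
    }


# Toy input: four citations to one paper, labels invented for illustration.
example = [
    Citation("foundational", "positive"),
    Citation("foundational", "neutral"),
    Citation("method use", "neutral"),   # hypothetical purpose label
    Citation("critical", "negative"),
]
profile = impact_profile(example)
print(profile["total"])
print(profile["polarity_share"])
print(profile["by_pair"])
```

A raw citation count would report the same value for any four citations, whereas the profile separates, for instance, a critical negative citation from a foundational positive one.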

Data availability

All data and materials support our published claims and comply with field standards.

Code availability

The software application or custom code supports our published claims and complies with field standards.

Acknowledgements

This work was supported by the General Program of the National Natural Science Foundation of China under Grant Nos. 72074020 and 71774012. The findings and observations in this paper are those of the authors and do not necessarily reflect the views of the supporters.

Funding

This work was supported by the General Program of the National Natural Science Foundation of China under Grant Nos. 72074020 and 71774012.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Heng Huang. The first draft of the manuscript was written by Heng Huang and all authors commented on subsequent versions of the manuscript. Revisions to the manuscript were guided by Donghua Zhu and Xuefeng Wang. All authors read and approved the final paper.

Corresponding author

Correspondence to Xuefeng Wang.

Ethics declarations

Conflicts of interest

Not applicable.

About this article

Cite this article

Huang, H., Zhu, D. & Wang, X. Evaluating scientific impact of publications: combining citation polarity and purpose. Scientometrics 127, 5257–5281 (2022). https://doi.org/10.1007/s11192-021-04183-8

  • DOI: https://doi.org/10.1007/s11192-021-04183-8
