Research on citation mention times and contributions using a neural network

Wang, Weibin; Wang, Zheng; Yu, Tian; Pak, CholMyong; Yu, Guang

doi:10.1007/s11192-020-03711-2

Published: 21 September 2020

Volume 125, pages 2383–2400 (2020)

Scientometrics

Weibin Wang¹, Zheng Wang², Tian Yu³, CholMyong Pak¹ & Guang Yu¹

644 Accesses

Abstract

As citation analysis has developed, the number of times a reference is mentioned in a citing paper (citation mention times) has drawn increasing attention. To extract mention times more conveniently and quickly, this study developed a high-accuracy citation recognition algorithm based on neural networks that automatically extracts the number of citation mentions in citing papers, and assessed its performance on PDF papers with different citation styles. We also used the algorithm to study how citation mentions are distributed and how much citations contribute to citing papers. The results showed that the proposed algorithm is feasible for citation-mention-related research and further verified that the statistical distribution of the number of citation mentions conforms to the generalised Pareto distribution. Meanwhile, references mentioned more than twice accounted for about 20–40% of the total and contributed more than other references.
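
To make the quantities in the abstract concrete, the following is a minimal Python sketch of counting mention times per reference and computing the share of references mentioned more than twice. It is not the authors' neural-network algorithm: it swaps in a plain regular expression and assumes the citing paper is already available as text with numeric bracketed citations such as [1], [1, 2] or [3-5]. The names `count_mentions`, `share_mentioned_more_than` and the sample text are hypothetical.

```python
import re
from collections import Counter

# Assumption: numeric bracketed citations such as [3], [1, 2] or [4-6].
# Other citation styles (author-year, superscripts) are not handled here.
CITATION = re.compile(r"\[(\d+(?:\s*[-–,]\s*\d+)*)\]")


def count_mentions(text):
    """Count how many times each numbered reference is mentioned in text."""
    counts = Counter()
    for match in CITATION.finditer(text):
        for part in re.split(r"\s*,\s*", match.group(1)):
            bounds = [int(n) for n in re.split(r"\s*[-–]\s*", part)]
            if len(bounds) == 1:
                counts[bounds[0]] += 1
            else:
                # A range like 4-6 mentions references 4, 5 and 6 once each.
                for ref in range(bounds[0], bounds[-1] + 1):
                    counts[ref] += 1
    return counts


def share_mentioned_more_than(counts, threshold=2):
    """Fraction of in-text cited references mentioned more than `threshold` times."""
    if not counts:
        return 0.0
    frequent = sum(1 for c in counts.values() if c > threshold)
    return frequent / len(counts)


# Hypothetical sample text, for illustration only.
sample = (
    "Earlier work [1] counted references. Later studies [1, 2] and [3-5] "
    "used full text. We follow [1] and extend [2]."
)
mentions = count_mentions(sample)
print(sorted(mentions.items()))
print(share_mentioned_more_than(mentions))
```

In this sample, reference 1 is mentioned three times and is the only one of the five cited references mentioned more than twice. The denominator covers only references that appear in the text, so references listed but never cited in the body would need to be added separately.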

Acknowledgements

This study was funded by the National Natural Science Foundation of China (Grant Nos. 71704035 and 71531013). The authors wish to express their sincere appreciation to the editors and reviewers of this paper.

Author information

Authors and Affiliations

School of Management, Harbin Institute of Technology, Harbin, 150001, China
Weibin Wang, CholMyong Pak & Guang Yu
School of Business Planning, Chongqing Technology and Business University, Chongqing, 400067, China
Zheng Wang
School of Economics and Management, Harbin Engineering University, Harbin, 150001, China
Tian Yu

Corresponding author

Correspondence to Guang Yu.

Appendices

Appendix 1: Algorithm used in this study

Appendix 2: Journals used in this study (2013–2017)

Discipline	No	Journal
1. Biochemistry and molecular biology	1	Biochimie
1. Biochemistry and molecular biology	2	Journal of Molecular Graphics and Modelling
2. Biophysics	3	Bioelectrochemistry
2. Biophysics	4	Biophysical Chemistry
3. Computer science	5	Computers in Industry
3. Computer science	6	Information and Computation
4. Computer science and information systems	7	Data and Knowledge Engineering
4. Computer science and information systems	8	International Journal of Medical Informatics
5. Construction and building technology	9	Cement and Concrete Composites
5. Construction and building technology	10	Cement and Concrete Research
6. Engineering, chemical	11	Advanced Powder Technology
6. Engineering, chemical	12	International Journal of Adhesion and Adhesives
7. Engineering, electrical and electronic	13	Microelectronics Journal
7. Engineering, electrical and electronic	14	Optical Fiber Technology
8. Medicine	15	Advances in Medical Sciences
8. Medicine	16	Forensic Science International
9. Operations research and management	17	Decision Support Systems
9. Operations research and management	18	Operations Research Letters
10. Statistics and probability	19	Journal of Multivariate Analysis
10. Statistics and probability	20	Stochastic Processes and their Applications

About this article

Cite this article

Wang, W., Wang, Z., Yu, T. et al. Research on citation mention times and contributions using a neural network. Scientometrics 125, 2383–2400 (2020). https://doi.org/10.1007/s11192-020-03711-2

Received: 04 April 2020
Published: 21 September 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s11192-020-03711-2
