Multi-task learning model for citation intent classification in scientific publications

Qi, Ruihua; Wei, Jia; Shao, Zhen; Li, Zhengguang; Chen, Heng; Sun, Yunhao; Li, Shaohua

doi:10.1007/s11192-023-04858-4

Published: 28 October 2023

Volume 128, pages 6335–6355, (2023)

Scientometrics

Ruihua Qi¹,² (ORCID: orcid.org/0000-0002-2583-3055),
Jia Wei²,
Zhen Shao²,
Zhengguang Li¹,
Heng Chen¹,
Yunhao Sun¹ &
Shaohua Li¹

Abstract

Citations play a significant role in the evaluation of scientific literature and researchers, and citation intent analysis is essential for understanding academic literature. The rapid growth of publicly accessible full-text literature makes it possible to enrich the semantic representations used for citation intent classification. However, some useful information that is readily available in the citation context and that facilitates citation intent analysis has not been fully explored. Furthermore, some deep learning models may not learn relevant features effectively because training samples for citation intent analysis tasks are insufficient. Multi-task learning exploits useful information shared between multiple tasks to improve learning performance and has shown promising results on many natural language processing tasks. In this paper, we propose a joint semantic representation model that combines pretrained language models with heterogeneous features of citation intent texts. Considering the correlation between the citation intent, citation section, and citation worthiness classification tasks, we build a multi-task citation classification framework with a soft parameter sharing constraint and construct independent models for the multiple tasks to improve the performance of citation intent classification. The experimental results demonstrate that the heterogeneous features and the multi-task framework with the soft parameter sharing constraint proposed in this paper enhance overall citation intent classification performance.
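
The soft parameter sharing idea named in the abstract can be illustrated with a minimal sketch. It assumes that each of the three tasks (citation intent, citation section, citation worthiness) keeps its own parameter vector and that the constraint is a squared L2 distance penalty between those vectors, which is one common formulation. The abstract does not state the exact regularizer, so the function names, the weight lam, and the toy values below are hypothetical and not taken from the paper.

```python
from itertools import combinations
from typing import Dict, List


def soft_sharing_penalty(params: Dict[str, List[float]]) -> float:
    """Sum of squared L2 distances between every pair of task parameter vectors.

    Assumption: this pairwise L2 form stands in for the paper's constraint,
    which the abstract does not spell out.
    """
    total = 0.0
    for (_, a), (_, b) in combinations(params.items(), 2):
        if len(a) != len(b):
            raise ValueError("task parameter vectors must have the same length")
        total += sum((x - y) ** 2 for x, y in zip(a, b))
    return total


def joint_loss(task_losses: Dict[str, float],
               params: Dict[str, List[float]],
               lam: float = 0.1) -> float:
    """Sum of the per-task losses plus the weighted soft-sharing penalty."""
    return sum(task_losses.values()) + lam * soft_sharing_penalty(params)


# Toy values, purely illustrative (hypothetical, not results from the paper).
params = {
    "intent": [1.0, 2.0],
    "section": [1.0, 0.0],
    "worthiness": [0.0, 2.0],
}
task_losses = {"intent": 0.5, "section": 0.25, "worthiness": 0.25}

print(soft_sharing_penalty(params))
print(joint_loss(task_losses, params, lam=0.1))
```

Because each task keeps separate parameters, the models stay independent, and the penalty only pulls them toward one another. A larger lam ties the tasks more tightly together, while lam = 0 reduces the setup to fully separate single-task training.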

Acknowledgements

This work was partially supported by grants from the Applied Basic Research Project of Liaoning Province (No. 2022JH2/101300270) and the Scientific Research Innovation Team Project of Dalian University of Foreign Languages (No. 2016CXTD06).

Author information

Authors and Affiliations

School of Software, Dalian University of Foreign Languages, Dalian, Liaoning, People’s Republic of China
Ruihua Qi, Zhengguang Li, Heng Chen, Yunhao Sun & Shaohua Li
Research Center for Language Intelligence, Dalian University of Foreign Languages, Dalian, Liaoning, People’s Republic of China
Ruihua Qi, Jia Wei & Zhen Shao

Corresponding author

Correspondence to Ruihua Qi.

Appendix

See Tables 7 and 8.

Table 7 Experimental results of using the feature set as the input to the single task of citation intent classification

Table 8 Experimental results of different auxiliary tasks on SciCite dataset

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Qi, R., Wei, J., Shao, Z. et al. Multi-task learning model for citation intent classification in scientific publications. Scientometrics 128, 6335–6355 (2023). https://doi.org/10.1007/s11192-023-04858-4

Received: 07 September 2022
Accepted: 13 October 2023
Published: 28 October 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s11192-023-04858-4
