ContriSci: A BERT-Based Multitasking Deep Neural Architecture to Identify Contribution Statements from Research Papers

Gupta, Komal; Ahmad, Ammaar; Ghosal, Tirthankar; Ekbal, Asif

doi:10.1007/978-3-030-91669-5_34

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13133)

Included in the following conference series:

International Conference on Asian Digital Libraries

Abstract

With the rapid growth of scientific literature, it is becoming increasingly difficult to identify scientific contributions in the deluge of research papers. Automatically identifying the specific contributions made in a research paper would enable quicker comprehension of the work, faster literature surveys, and easier comparison with related work. In this work, we investigate methods to automatically extract contribution statements from research articles. We design a multitask deep neural network that leverages section identification and citance classification of scientific statements to predict whether a given scientific statement specifies a contribution. In the long run, we envisage creating a knowledge graph of scientific contributions for machine comprehension and more straightforward navigation of research contributions in a particular domain. Our approach achieves the best performance over earlier methods (a relative improvement of 8.08% in terms of \(F_1\) score) for contributing sentence identification on a dataset of Natural Language Processing (NLP) papers. Our code is available at https://github.com/ammaarahmad1999/Sem-Eval-2021-Task-A.
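The multitask idea in the abstract can be illustrated with a minimal sketch: one shared sentence representation feeds three classification heads (contribution, section, citance), and the per-task losses are combined into a single training objective. This is not the authors' implementation. The abstract does not say how the losses are combined, so the weighted sum, the task weights, the number of section classes, and all names below (`multitask_loss`, `linear_head`, the `Head` type) are assumptions made for illustration. The `features` vector stands in for the output of a BERT-style encoder, which is omitted so that the example runs with the standard library alone.

```python
import math
from typing import Dict, List, Sequence, Tuple

# A head is (weight rows, biases), with one weight row per output class.
# Hypothetical structure, not taken from the paper.
Head = Tuple[List[List[float]], List[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Negative log-likelihood of the gold label under softmax(logits)."""
    return -math.log(softmax(logits)[label])


def linear_head(features: Sequence[float], head: Head) -> List[float]:
    """Task-specific linear layer applied to the shared representation."""
    weights, biases = head
    return [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]


def multitask_loss(
    features: Sequence[float],
    heads: Dict[str, Head],
    labels: Dict[str, int],
    task_weights: Dict[str, float],
) -> float:
    """Weighted sum of per-task cross-entropy losses (assumed combination)."""
    return sum(
        task_weights[task] * cross_entropy(linear_head(features, head), labels[task])
        for task, head in heads.items()
    )


if __name__ == "__main__":
    # Placeholder for an encoder output; a real model would produce this vector.
    features = [0.2, -0.1, 0.4]
    heads = {
        "contribution": ([[0.1, 0.0, 0.3], [-0.2, 0.1, 0.0]], [0.0, 0.0]),
        "section": ([[0.0, 0.1, 0.0]] * 4, [0.0] * 4),  # 4 classes: assumed
        "citance": ([[0.3, 0.0, 0.0], [0.0, 0.0, 0.3]], [0.0, 0.0]),
    }
    labels = {"contribution": 1, "section": 2, "citance": 0}
    task_weights = {"contribution": 1.0, "section": 0.5, "citance": 0.5}
    print(multitask_loss(features, heads, labels, task_weights))
```

In this sketch the auxiliary tasks are down-weighted so that the main contribution task dominates the objective. That weighting is a common choice in multitask setups, not a detail reported in the abstract.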

K. Gupta and A. Ahmad—Equal Contribution.

Acknowledgement

Asif Ekbal is a recipient of the Visvesvaraya Young Faculty Award and acknowledges Digital India Corporation, Ministry of Electronics and Information Technology, Government of India for supporting this research.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna, India
Komal Gupta, Ammaar Ahmad & Asif Ekbal
Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
Tirthankar Ghosal

Corresponding author

Correspondence to Komal Gupta.

Editor information

Editors and Affiliations

National Taiwan Normal University, Taipei, Taiwan
Hao-Ren Ke
Nanyang Technological University, Singapore, Singapore
Chei Sian Lee
Kyoto University, Kyoto, Japan
Kazunari Sugiyama

About this paper

Cite this paper

Gupta, K., Ahmad, A., Ghosal, T., Ekbal, A. (2021). ContriSci: A BERT-Based Multitasking Deep Neural Architecture to Identify Contribution Statements from Research Papers. In: Ke, HR., Lee, C.S., Sugiyama, K. (eds) Towards Open and Trustworthy Digital Societies. ICADL 2021. Lecture Notes in Computer Science, vol 13133. Springer, Cham. https://doi.org/10.1007/978-3-030-91669-5_34

DOI: https://doi.org/10.1007/978-3-030-91669-5_34
Published: 30 November 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-91668-8
Online ISBN: 978-3-030-91669-5
eBook Packages: Computer Science, Computer Science (R0)
