ContriSci: A BERT-Based Multitasking Deep Neural Architecture to Identify Contribution Statements from Research Papers

  • Conference paper
Towards Open and Trustworthy Digital Societies (ICADL 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13133)

Abstract

With the rapid growth of scientific literature, it is becoming increasingly difficult to identify the scientific contributions in the deluge of research papers. Automatically identifying the specific contribution made in a research paper would enable quicker comprehension of the work, faster literature surveys, easier comparison with related work, and similar tasks. In this work, we investigate methods to automatically extract contribution statements from research articles. We design a multitask deep neural network that leverages section identification and citance classification of scientific statements to predict whether a given scientific statement specifies a contribution. In the long run, we envisage creating a knowledge graph of scientific contributions for machine comprehension and more straightforward navigation of research contributions in a particular domain. Our approach outperforms earlier methods (a relative improvement of 8.08% in terms of \(F_1\) score) for contributing sentence identification on a dataset of Natural Language Processing (NLP) papers. Our code is available at https://github.com/ammaarahmad1999/Sem-Eval-2021-Task-A.
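
The multitask idea in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: it assumes one shared sentence representation (here a random stand-in vector in place of a BERT-style encoder output), one linear head per task, and a weighted sum of cross-entropy losses. The helper names, the number of section labels, and the task weights are all hypothetical and chosen only for illustration.

```python
import math
import random


def linear(weights, bias, x):
    """Return the logits W x + b for a single feature vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, computed in a numerically stable way."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[label]


def init_head(n_classes, dim, rng):
    """Create a small randomly initialised linear classification head."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_classes)]
    return weights, [0.0] * n_classes


def multitask_loss(shared_repr, heads, labels, task_weights):
    """Weighted sum of per-task losses computed from one shared representation."""
    per_task = {}
    total = 0.0
    for task, (weights, bias) in heads.items():
        loss = cross_entropy(linear(weights, bias, shared_repr), labels[task])
        per_task[task] = loss
        total += task_weights[task] * loss
    return total, per_task


rng = random.Random(0)
DIM = 8  # hypothetical size; a BERT-style encoder would give a much larger vector

# Assumption: a random vector stands in for the encoder output of one sentence.
shared_repr = [rng.gauss(0.0, 1.0) for _ in range(DIM)]

heads = {
    "contribution": init_head(2, DIM, rng),  # main task: contribution or not
    "section": init_head(5, DIM, rng),       # auxiliary task; 5 labels is a placeholder
    "citance": init_head(2, DIM, rng),       # auxiliary task: citance or not
}
labels = {"contribution": 1, "section": 3, "citance": 0}
task_weights = {"contribution": 1.0, "section": 0.3, "citance": 0.3}  # hypothetical

total, per_task = multitask_loss(shared_repr, heads, labels, task_weights)
print(f"total loss: {total:.4f}")
for task, loss in per_task.items():
    print(f"  {task}: {loss:.4f}")
```

In a trained model, the gradient of the combined loss would update the shared encoder, so the auxiliary tasks shape the representation used by the main contribution classifier. The weighting between tasks is a design choice that would normally be tuned.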

K. Gupta and A. Ahmad—Equal Contribution.

Acknowledgement

Asif Ekbal is a recipient of the Visvesvaraya Young Faculty Award and acknowledges Digital India Corporation, Ministry of Electronics and Information Technology, Government of India for supporting this research.

Corresponding author

Correspondence to Komal Gupta.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Gupta, K., Ahmad, A., Ghosal, T., Ekbal, A. (2021). ContriSci: A BERT-Based Multitasking Deep Neural Architecture to Identify Contribution Statements from Research Papers. In: Ke, HR., Lee, C.S., Sugiyama, K. (eds) Towards Open and Trustworthy Digital Societies. ICADL 2021. Lecture Notes in Computer Science, vol 13133. Springer, Cham. https://doi.org/10.1007/978-3-030-91669-5_34

  • DOI: https://doi.org/10.1007/978-3-030-91669-5_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-91668-8

  • Online ISBN: 978-3-030-91669-5

  • eBook Packages: Computer Science, Computer Science (R0)
