A Unified Information Extraction System Based on Role Recognition and Combination

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13029)

Abstract

In this paper, we propose a unified information extraction system that handles both event extraction (EE) and relation extraction (RE). Given a context and a schema, event extraction aims to extract events and the specific roles within them, while relation extraction extracts all subject-predicate-object (SPO) triples. We formulate both tasks under one extraction scheme: role recognition and role combination. We use a Multi-Label Pointer Network (MLPN) to recognize composite roles, which carry both event/relation and role information, and simultaneously train a Co-occurrence Matrix (CM) to determine the co-occurrence relationship between composite roles, i.e., whether two roles describe the same event/relation. Using this Unified model based on Role Recognition and Combination (URRC) and a corresponding combination strategy, we implement three tasks: sentence-level event extraction, document-level event extraction, and relation extraction. In LIC 2021, our model ranked 6th in the Multi-format Information Extraction track, with an average \(F_1\) score of 77.44% on the final test sets of the three subtasks.
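
The role-combination idea can be illustrated with a minimal sketch. It is not the authors' combination strategy, which the abstract does not specify. It assumes that role recognition has already produced composite roles written as hypothetical `"type:role"` labels, and that a co-occurrence score is available for each pair of roles; the function name `combine_roles`, the label format, the 0.5 threshold, and the grouping by connected components are all assumptions made for illustration.

```python
from collections import defaultdict


def split_label(label):
    """Split a hypothetical composite label 'type:role' into (type, role)."""
    event_type, role = label.split(":", 1)
    return event_type, role


def combine_roles(roles, cooccur, threshold=0.5):
    """Group recognized composite roles into events/relations.

    roles:   list of (composite_label, text) pairs
    cooccur: square matrix; cooccur[i][j] scores whether roles i and j
             describe the same event/relation (assumed model output)
    """
    n = len(roles)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link two roles when they share a type and co-occur strongly enough.
    for i in range(n):
        for j in range(i + 1, n):
            same_type = split_label(roles[i][0])[0] == split_label(roles[j][0])[0]
            if same_type and cooccur[i][j] >= threshold:
                parent[find(i)] = find(j)

    # Each connected component becomes one event/relation.
    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)

    results = []
    for members in groups.values():
        event_type = split_label(roles[members[0]][0])[0]
        arguments = [(split_label(roles[i][0])[1], roles[i][1]) for i in members]
        results.append({"type": event_type, "arguments": arguments})
    return results


# Toy input: two roles of one event and one role of another (made-up data).
roles = [
    ("Acquisition:buyer", "A Corp"),
    ("Acquisition:target", "B Ltd"),
    ("Resignation:person", "Ms. Li"),
]
cooccur = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
events = combine_roles(roles, cooccur)
```

On this toy input the two `Acquisition` roles fall into one group and the `Resignation` role forms its own. Connected components are only one possible design choice: they merge roles transitively, which may over-merge when a single argument is shared by several events.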

Notes

  1. https://huggingface.co/hfl/chinese-roberta-wwm-ext

  2. https://huggingface.co/hfl/chinese-roberta-wwm-ext-large

Author information

Corresponding author

Correspondence to Yadong Zhang.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, Y., Lan, M. (2021). A Unified Information Extraction System Based on Role Recognition and Combination. In: Wang, L., Feng, Y., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2021. Lecture Notes in Computer Science, vol 13029. Springer, Cham. https://doi.org/10.1007/978-3-030-88483-3_36

  • DOI: https://doi.org/10.1007/978-3-030-88483-3_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88482-6

  • Online ISBN: 978-3-030-88483-3

  • eBook Packages: Computer Science, Computer Science (R0)
