Abstract
To address the scarcity of data for document-level event extraction, we present DuEE-Fin, a large-scale benchmark comprising more than 15,000 events across 13 event types and more than 81,000 event arguments mapped to 92 argument roles. DuEE-Fin is built from real-world Chinese financial news, in which one document may contain several events, multiple arguments may share the same argument role, and one argument may play different roles in different events. It therefore poses considerable challenges, such as multi-event recognition and multi-value argument identification, which are regarded as key issues in document-level event extraction. Alongside DuEE-Fin, we hosted an open competition that attracted 1,690 teams and produced encouraging results. We also evaluated the most popular document-level event extraction systems on DuEE-Fin, and even some state-of-the-art (SOTA) models performed poorly on our data. These results indicate that more effective methods are needed.
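The three properties named in the abstract (several events per document, several arguments sharing one role, and one argument taking different roles in different events) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class names, the helper functions, the event types, the role labels, and the sample document are hypothetical and do not reproduce the actual DuEE-Fin schema or data.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Argument:
    role: str  # argument role (hypothetical label, not a DuEE-Fin role name)
    text: str  # argument span as it appears in the document


@dataclass
class Event:
    event_type: str  # hypothetical event type
    arguments: List[Argument] = field(default_factory=list)


@dataclass
class Document:
    doc_id: str
    events: List[Event] = field(default_factory=list)


def is_multi_event(doc: Document) -> bool:
    """True when the document carries more than one event record."""
    return len(doc.events) > 1


def multi_value_roles(event: Event) -> Dict[str, Set[str]]:
    """Roles of one event that are filled by more than one distinct argument."""
    values: Dict[str, Set[str]] = defaultdict(set)
    for arg in event.arguments:
        values[arg.role].add(arg.text)
    return {role: texts for role, texts in values.items() if len(texts) > 1}


def roles_across_events(doc: Document) -> Dict[str, Set[str]]:
    """Argument texts that take different roles in different events."""
    seen = defaultdict(set)  # argument text -> {(event index, role)}
    for idx, event in enumerate(doc.events):
        for arg in event.arguments:
            seen[arg.text].add((idx, arg.role))
    result: Dict[str, Set[str]] = {}
    for text, pairs in seen.items():
        if any(i != j and r != s for i, r in pairs for j, s in pairs):
            result[text] = {role for _, role in pairs}
    return result


# Invented sample document, for illustration only.
doc = Document(
    doc_id="example-0",
    events=[
        Event("share_pledge", [
            Argument("pledger", "Company A"),
            Argument("pledgee", "Bank X"),
            Argument("pledgee", "Bank Y"),
        ]),
        Event("share_buyback", [
            Argument("buyer", "Company A"),
        ]),
    ],
)
```

In this invented document, `is_multi_event(doc)` holds, `multi_value_roles(doc.events[0])` reports the `pledgee` role with two values, and `roles_across_events(doc)` reports `Company A` with the roles `pledger` and `buyer`.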
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Han, C., Zhang, J., Li, X., Xu, G., Peng, W., Zeng, Z. (2022). DuEE-Fin: A Large-Scale Dataset for Document-Level Event Extraction. In: Lu, W., Huang, S., Hong, Y., Zhou, X. (eds) Natural Language Processing and Chinese Computing. NLPCC 2022. Lecture Notes in Computer Science, vol 13551. Springer, Cham. https://doi.org/10.1007/978-3-031-17120-8_14
DOI: https://doi.org/10.1007/978-3-031-17120-8_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17119-2
Online ISBN: 978-3-031-17120-8
eBook Packages: Computer Science (R0)