Abstract
Knowledge graphs (KGs) are pivotal for organizing and managing structured information across a wide range of applications. Financial KGs have been successfully employed in applications such as audit, anti-fraud, and anti-money laundering. Despite this success, the construction of Chinese financial KGs has received limited research attention because of the complex semantics involved. A significant challenge is the overlapping triples problem, in which an entity participates in multiple relations within a single sentence; this hampers extraction accuracy, and more than 39% of the triples in Chinese datasets overlap in this way. To address this, we propose the Entity-type-Enriched Cascaded Neural Network (E2CNN), which uses special tokens to mark entity boundaries and types. E2CNN enforces consistency in entity types and excludes specific relations, which mitigates the overlapping triples problem and improves relation extraction. In addition, we introduce FinCorpus.CN, an available Chinese financial dataset annotated from the annual reports of 2,000 companies and containing 48,389 entities and 23,368 triples. Experimental results on the DuIE dataset and FinCorpus.CN show that E2CNN outperforms state-of-the-art models.
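The two ideas named in the abstract, marking entity boundaries and types with special tokens and measuring how many triples overlap, can be illustrated with a minimal Python sketch. This is not the E2CNN implementation: the marker format (`<S:ORG>`, `</S:ORG>`), the function names, the example relation labels, and the overlap definition used here (a triple overlaps if it shares an entity with another triple in the same sentence) are all assumptions made for illustration.

```python
from collections import Counter


def insert_typed_markers(tokens, spans):
    """Wrap entity spans with hypothetical boundary/type marker tokens.

    spans: list of (role, start, end_exclusive, entity_type).
    Assumes the spans do not overlap each other.
    """
    opens, closes = {}, {}
    for role, start, end, etype in spans:
        opens.setdefault(start, []).append(f"<{role}:{etype}>")
        closes.setdefault(end, []).append(f"</{role}:{etype}>")

    out = []
    for i in range(len(tokens) + 1):
        out.extend(closes.get(i, []))  # close spans that end before token i
        out.extend(opens.get(i, []))   # open spans that start at token i
        if i < len(tokens):
            out.append(tokens[i])
    return out


def overlap_fraction(triples):
    """Fraction of (subject, relation, object) triples in one sentence
    that share at least one entity with another triple.

    This overlap definition is an assumption, not taken from the paper.
    """
    if not triples:
        return 0.0
    counts = Counter()
    for subj, _, obj in triples:
        for entity in {subj, obj}:  # count each entity once per triple
            counts[entity] += 1
    overlapping = sum(
        1 for subj, _, obj in triples if counts[subj] > 1 or counts[obj] > 1
    )
    return overlapping / len(triples)


# Illustrative inputs only; entity names and relation labels are made up.
marked = insert_typed_markers(
    ["A", "acquired", "B"],
    [("S", 0, 1, "ORG"), ("O", 2, 3, "ORG")],
)
ratio = overlap_fraction(
    [("A", "invests_in", "B"), ("A", "parent_of", "C"), ("D", "ceo_of", "E")]
)
```

In this sketch the marked token sequence would be fed to an encoder so that the relation classifier sees both where each entity is and what type it has, which is one plausible way to rule out relations whose type signature does not match.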
Acknowledgements
This work was supported in part by the National Key R&D Program of China (Grant No. 2020AAA0108501).
Ethics declarations
Competing interests
The authors declare that they have no competing interests or financial conflicts to disclose.
Additional information
Mengfan Li received the BS degree from the School of Computer Science and Engineering, Northeastern University, China in 2021. She is currently working towards the PhD degree with the School of Computer Science and Technology, Huazhong University of Science and Technology, China. Her research interests include large language models, graph neural networks, and knowledge graphs.
Xuanhua Shi is a professor in the National Engineering Research Center for Big Data Technology and System/Services Computing Technology and System Lab, Huazhong University of Science and Technology, China. He received the PhD degree in computer engineering from Huazhong University of Science and Technology, China in 2005. From 2006, he worked as an INRIA postdoc in the PARIS team at Rennes for one year. His current research interests focus on cloud computing and big data processing. He has published over 100 peer-reviewed publications and has received research support from a variety of governmental and industrial organizations, such as the National Natural Science Foundation of China, the Ministry of Science and Technology of China, the Ministry of Education of China, the European Union, Alibaba, ByteDance, and Intel. Shi is a senior member of IEEE and CCF.
Chenqi Qiao received the BS degree from the Huazhong University of Science and Technology (HUST), China in 2022. He is now working toward the MS degree in the School of Computer Science and Technology, HUST, China.
Xiao Huang is an assistant professor in the Department of Computing at The Hong Kong Polytechnic University (PolyU), China. He received a BS in Engineering from Shanghai Jiao Tong University, China in 2012, an MS in Electrical Engineering from Illinois Institute of Technology, USA in 2015, and a PhD in Computer Engineering from Texas A&M University, USA in 2020. His research interests include large language models, graph neural networks, knowledge graphs, network anomaly detection, and recommender systems. He has received over 2,600 citations. He is a program committee member of ICLR 2022–2024, NeurIPS 2021–2023, AAAI 2021–2023, KDD 2019–2023, TheWebConf 2022–2023, ICML 2021–2023, IJCAI 2020–2023, CIKM 2019–2022, and WSDM 2021–2023. Before joining PolyU, he worked as a research intern at Microsoft Research and Baidu USA.
Weihao Wang received the MS degree from Huazhong University of Science and Technology, China. His research interest lies in knowledge graphs.
Yao Wan received his PhD degree from the College of Computer Science, Zhejiang University, China in 2019. He is currently a lecturer at the College of Computer Science and Technology, Huazhong University of Science and Technology, China. He has been a visiting student of the University of Technology Sydney, Australia and the University of Illinois at Chicago, USA in 2016 and 2018, respectively. His research interests lie in the synergy between artificial intelligence and software engineering, especially natural language processing, programming languages, software engineering, and data mining.
Teng Zhang is an associate professor in the College of Computer Science and Technology, Huazhong University of Science and Technology, China. He received his PhD degree from the Department of Computer Science and Technology, Nanjing University, China in 2019. His research interests include machine learning, data mining, artificial intelligence, and optimization. He is a program committee member of ICML 2018–2023, NeurIPS 2018–2023, IJCAI 2017–2023, AAAI 2017–2024, and KDD 2017–2023, and an associate editor of Frontiers of Computer Science. He received the best paper award at IEEE BigComp 2020.
Hai Jin received his PhD degree in computer engineering from Huazhong University of Science and Technology, China in 1994. He received a German Academic Exchange Service fellowship to visit the Technical University of Chemnitz, Germany in 1996. He worked at the University of Hong Kong, China between 1998 and 2000, and as a visiting scholar at the University of Southern California, USA between 1999 and 2000. He received the Excellent Youth Award from the National Natural Science Foundation of China in 2001. He is a Cheung Kong Scholars chair professor of computer science and engineering at Huazhong University of Science and Technology, the chief scientist of ChinaGrid, the largest grid computing project in China, and the chief scientist of the National 973 Basic Research Program Project of Virtualization Technology of Computing System, and Cloud Security. He has coauthored 22 books and published over 800 research papers. His research interests include computer architecture, virtualization technology, cluster computing and cloud computing, peer-to-peer computing, network storage, and network security. He is a fellow of the CCF, a fellow of the IEEE, and a member of the ACM.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Li, M., Shi, X., Qiao, C. et al. E2CNN: entity-type-enriched cascaded neural network for Chinese financial relation extraction. Front. Comput. Sci. 19, 1910352 (2025). https://doi.org/10.1007/s11704-024-3983-6
DOI: https://doi.org/10.1007/s11704-024-3983-6