Abstract
With the continuous development of relation extraction, the ability to extract nontaxonomic relations has improved frustratingly slowly, and the only relation extraction dataset in the public opinion domain is the New York Times (NYT) dataset, which is annotated by distant supervision. This paper addresses both issues. We first propose a new model tailored for nontaxonomic relation extraction that combines a context-aware model with a weighted graph convolutional network (WGCN) over dependency trees, effectively blending contextual and dependency-structure information. We further apply a pruning strategy to the input tree so that the model retains valid information and discards redundant information. We then build a supervised Chinese relation extraction dataset, XUNRED (Xinjiang University Nontaxonomic Relation Extraction Dataset), by manually annotating text from Baidu Encyclopedia, Baidu Post Bar and Baidu Information Flow, targeting nontaxonomic relations in the public opinion domain. Experimental results on this new dataset show that our model combines contextual information with the structural information in the dependency tree better than other models.
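The graph-convolution-over-dependency-trees step the abstract describes can be illustrated with a minimal sketch. The paper's actual edge weighting (the "W" in WGCN) and its pruning strategy are not specified in the abstract, so the plain degree-normalized propagation below (in the style of Kipf and Welling's GCN, reference above) and the toy dependency tree are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gcn_layer(adj, h, w):
    """One graph-convolution layer over a dependency tree:
    h' = ReLU(D^-1 (A + I) h W) -- propagation with self-loops
    and degree normalization, as in a standard (unweighted) GCN."""
    a = adj + np.eye(adj.shape[0])           # add self-loops
    d = a.sum(axis=1, keepdims=True)         # node degrees
    return np.maximum(0.0, (a / d) @ h @ w)  # normalize, transform, ReLU

# Toy dependency tree for a 5-token sentence: each token points to its head
# (-1 marks the root). A weighted variant would put non-binary values in adj.
heads = [1, -1, 1, 1, 3]
n = len(heads)
adj = np.zeros((n, n))
for i, hd in enumerate(heads):
    if hd != -1:
        adj[i, hd] = adj[hd, i] = 1.0        # undirected dependency edges

rng = np.random.default_rng(0)
h = rng.standard_normal((n, 8))              # token representations
w = rng.standard_normal((8, 8))              # layer weights
out = gcn_layer(adj, h, w)
print(out.shape)  # (5, 8): one updated vector per token
```

Stacking such layers lets each token aggregate information from tokens several dependency hops away; pruning the input tree (e.g. to the path between the two entities plus nearby nodes) shrinks `adj` before this step.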
Notes
Lexical analysis document provided by iFLYTEK open platform https://www.xfyun.cn/doc/nlp/lexicalAnalysis/API.html
References
Geng ZQ, Chen GF, Han YM, Gang L, Li F (2020) Semantic relation extraction using sequential and tree-structured LSTM with attention. Inf Sci 509:183–192. https://doi.org/10.1016/j.ins.2019.09.006
Zhou L, Wang T, Qu H et al (2020) A weighted GCN with logical adjacency matrix for relation extraction. In: ECAI 2020 - 24th European Conference on Artificial Intelligence
Zhang Y, Qi P, Manning CD (2018) Graph convolution over pruned dependency trees improves relation extraction. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pp 2205–2215. https://doi.org/10.18653/v1/D18-1244
Zhang M, Zhou GD, Aw A (2008) Exploring syntactic structured features over parse trees for relation extraction using kernel methods. Information Processing & Management 44(2):687–701. https://doi.org/10.1016/j.ipm.2007.07.013
Choi M, Kim H (2013) Social relation extraction from texts using a support-vector-machine-based dependency trigram kernel. Information Processing & Management 49(1):303–311. https://doi.org/10.1016/j.ipm.2012.04.002
McDonald RT, Pereira F, Kulick S, Winters RS, Jin Y, White PS (2005) Simple algorithms for complex relation extraction with applications to biomedical IE. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05). Association for Computational Linguistics, Ann Arbor, pp 491–498. https://doi.org/10.3115/1219840.1219901
Mintz M, Bills S, Snow R, Jurafsky D (2009) Distant supervision for relation extraction without labeled data. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP. Association for Computational Linguistics, Suntec, pp 1003–1011. https://www.aclweb.org/anthology/P09-1113
Zhao D, Wang J, Lin H, Wang X, Yang Z, Zhang Y (2021) Biomedical cross-sentence relation extraction via multihead attention and graph convolutional networks. Appl Soft Comput 104:107230. https://doi.org/10.1016/j.asoc.2021.107230
Li P, Mao K (2019) Knowledge-oriented convolutional neural network for causal relation extraction from natural language texts. Expert Syst Appl 115:512–523. https://doi.org/10.1016/j.eswa.2018.08.009
Wang L, Cao Z, de Melo G, Liu Z (2016) Relation classification via multi-level attention CNNs. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Berlin, pp 1298–1307. https://doi.org/10.18653/v1/P16-1123
He Z, Chen W, Li Z, Zhang W, Shao H, Zhang M (2019) Syntax-aware entity representations for neural relation extraction. Artif Intell 275:602–617. https://doi.org/10.1016/j.artint.2019.07.004
Santoso J, Setiawan EI, Purwanto CN, Yuniarno EM, Hariadi M, Purnomo MH (2021) Named entity recognition for extracting concept in ontology building on Indonesian language using end-to-end bidirectional long short term memory. Expert Systems with Applications 176:114856. https://doi.org/10.1016/j.eswa.2021.114856
Vu NT, Adel H, Gupta P, Schütze H (2016) Combining recurrent and convolutional neural networks for relation classification. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, San Diego, pp 534–539. https://doi.org/10.18653/v1/N16-1065
Han J, Wang H (2021) Transformer based network for Open Information Extraction. Engineering Applications of Artificial Intelligence 102:104262. https://doi.org/10.1016/j.engappai.2021.104262
Xu Y, Mou L, Li G, Chen Y, Peng H, Jin Z (2015) Classifying relations via long short term memory networks along shortest dependency paths. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp 1785–1794. arXiv:1508.03720
Liu Y, Wei F, Li S, Ji H, Zhou M, Wang H (2015) A dependency-based neural network for relation classification. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). Association for Computational Linguistics, Beijing, pp 285–290. https://doi.org/10.3115/v1/P15-2047
Miwa M, Bansal M (2016) End-to-end relation extraction using LSTMs on sequences and tree structures. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Alt C, Hübner M, Hennig L (2019) Improving relation extraction by pre-trained language representations. In: Proceedings of the 2019 Conference on Automated Knowledge Base Construction, Amherst, Massachusetts
Shah SM, Taju SW, Ho Q-T, Nguyen T-T-D, Yu-Yen O (2021) GT-Finder: Classify the family of glucose transporters with pre-trained BERT language models. Comput Biol Med 131:104259. https://doi.org/10.1016/j.compbiomed.2021.104259
Joshi M, Chen D, Liu Y et al (2020) SpanBERT: Improving pre-training by representing and predicting spans. Transactions of the Association for Computational Linguistics 8:64–77. https://doi.org/10.1162/tacl_a_00300
Zhang Z, Han X, Liu Z, Jiang X, Sun M, Liu Q (2019) ERNIE: Enhanced language representation with informative entities. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pp 1441–1451. https://doi.org/10.18653/v1/P19-1139
Peters ME, Neumann M, Logan R, Schwartz R, Joshi V, Singh S, Smith NA (2019) Knowledge enhanced contextual word representations. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, pp 43–54. https://doi.org/10.18653/v1/D19-1005
Soares LB, FitzGerald N, Ling J, Kwiatkowski T (2019) Matching the blanks: Distributional similarity for relation learning. In :Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pp 2895–2905. https://doi.org/10.18653/v1/P19-1279
Doddington G, Mitchell A, Przybocki M, Ramshaw L, Strassel S, Weischedel R (2004) The automatic content extraction (ACE) program - tasks, data, and evaluation. In: Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC'04). European Language Resources Association (ELRA), Lisbon, pp 837–840
Walker C, Strassel S, Medero J et al (2006) ACE 2005 Multilingual Training Corpus, vol 110
Riedel S, Yao L, McCallum A (2010) Modeling relations and their mentions without labeled text. In: Proceedings of ECML-PKDD, pp 148–163
Han X, Zhu H, Yu P, Wang Z, Yao Y, Liu Z, Sun M (2018) FewRel: A large-scale supervised few-shot relation classification dataset with state-of-the-art evaluation. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pp 4803–4809. https://doi.org/10.18653/v1/D18-1514
Hendrickx I, Kim SN, Kozareva Z, Nakov P, Ó Séaghdha D, Padó S, Pennacchiotti M, Romano L, Szpakowicz S (2010) SemEval-2010 task 8: Multi-way classification of semantic relations between pairs of nominals. In: Proceedings of the 5th International Workshop on Semantic Evaluation. Association for Computational Linguistics, Uppsala, pp 33–38. https://www.aclweb.org/anthology/S10-1006
Zhang Y, Zhong V, Chen D, Angeli G, Manning CD (2017) Position-aware attention and supervised data improve slot filling. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp 35–45
Kim J-D, Ohta T, Pyysalo S, Kano Y, Tsujii J (2009) Overview of BioNLP'09 shared task on event extraction. In: Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task. Association for Computational Linguistics, Boulder, pp 1–9. https://www.aclweb.org/anthology/W09-1401
Yao Y, Ye D, Li P, Han X, Lin Y, Liu Z, Liu Z, Huang L, Zhou J, Sun M (2019) DocRED: A large-scale document-level relation extraction dataset. In :Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pp 764–777. https://doi.org/10.18653/v1/P19-1074
Zhou Y, Pan L, Bai C, Luo S, Wu Z (2021) Self-selective attention using correlation between instances for distant supervision relation extraction. Neural Networks. https://doi.org/10.1016/j.neunet.2021.04.032
Gori M, Monfardini G, Scarselli F (2005) A new model for learning in graph domains. In :Proceedings. 2005 IEEE International Joint Conference on Neural Networks. https://doi.org/10.1109/IJCNN.2005.1555942
Qi C, Zhang J, Jia H, Mao Q, Wang L, Song H (2021) Deep face clustering using residual graph convolutional network. Knowl-Based Syst 211:106561. https://doi.org/10.1016/j.knosys.2020.106561
Fu T-J, Li P-H, Ma W-Y (2019) GraphRel: Modeling text as relational graphs for joint entity and relation extraction. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, pp 1409–1418. https://doi.org/10.18653/v1/P19-1136
Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. International Conference on Learning Representations (ICLR). arXiv:1609.02907
Li S, Zhao Z, Hu R, Li W, Liu T, Du X (2018) Analogical reasoning on Chinese morphological and semantic relations. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, Melbourne, pp 138–143. https://doi.org/10.18653/v1/P18-2023
Sorokin D, Gurevych I (2017) Context-aware representations for knowledge base relation extraction. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, pp 1784–1789. https://doi.org/10.18653/v1/D17-1188
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Computation 9:1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
Cai R, Zhang X, Wang H (2016) Bidirectional recurrent convolutional neural network for relation classification. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Berlin, pp 756–765. https://doi.org/10.18653/v1/P16-1072
Acknowledgements
The open-source code released with the paper "Graph Convolution over Pruned Dependency Trees Improves Relation Extraction" is easy to read and implement and helped us carry out our research and experiments. We also thank the iFLYTEK open platform for the services it provided during the construction of our dataset. This work is supported in part by the National Natural Science Foundation of China (No. 61966034).
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Wang, G., Liu, S. & Wei, F. Weighted graph convolution over dependency trees for nontaxonomic relation extraction on public opinion information. Appl Intell 52, 3403–3417 (2022). https://doi.org/10.1007/s10489-021-02596-9