Abstract
Large Language Models (LLMs) have significantly improved performance on a wide range of NLP tasks. Yet for Chinese Information Extraction (IE), LLMs can perform poorly because they lack fine-grained linguistic and semantic knowledge. In this paper, we propose Unified Knowledgeable Tuning (UKT), a lightweight yet effective framework that can be applied to several recently proposed Transformer-based Chinese IE models. In UKT, both linguistic and semantic knowledge are incorporated into word representations. We further propose a relational knowledge validation technique that forces the model to learn the injected knowledge, improving its generalization ability. We evaluate UKT on five public datasets covering two major Chinese IE tasks. Experiments confirm the effectiveness and universality of our approach, which achieves consistent improvements over state-of-the-art models.
Notes
- 2. Source code will be released in the EasyNLP framework [22].
References
Chen, G., Tian, Y., Song, Y., Wan, X.: Relation extraction with type-aware map memories of word dependencies. In: ACL-IJCNLP, pp. 2501–2512 (2021)
Cui, Y., Che, W., Liu, T., Qin, B., Yang, Z.: Pre-training with whole word masking for Chinese BERT. IEEE/ACM Trans. Audio Speech Lang. Process. 29, 3504–3514 (2021)
Fundel, K., Küffner, R., Zimmer, R.: Relex—relation extraction using dependency parse trees. Bioinformatics 23(3), 365–371 (2007)
Gui, T., Ma, R., Zhang, Q., Zhao, L., Jiang, Y.G., Huang, X.: CNN-based Chinese NER with lexicon rethinking. In: IJCAI, pp. 4982–4988 (2019)
Gui, T., et al.: A lexicon-based graph neural network for Chinese NER. In: EMNLP-IJCNLP, pp. 1040–1050 (2019)
He, H., Sun, X.: F-score driven max margin neural network for named entity recognition in Chinese social media. In: EACL, pp. 713–718 (2017)
Hu, B., Huang, Z., Hu, M., Zhang, Z., Dou, Y.: Adaptive threshold selective self-attention for Chinese NER. In: COLING, pp. 1823–1833 (2022)
Levow, G.A.: The third international Chinese language processing bakeoff: word segmentation and named entity recognition. In: SIGHAN, pp. 108–117 (2006)
Li, F., Lin, Z., Zhang, M., Ji, D.: A span-based model for joint overlapped and discontinuous named entity recognition. In: ACL-IJCNLP, pp. 4814–4828 (2021)
Li, X., Yan, H., Qiu, X., Huang, X.J.: FLAT: Chinese NER using flat-lattice transformer. In: ACL, pp. 6836–6842 (2020)
Li, Z., Ding, N., Liu, Z., Zheng, H., Shen, Y.: Chinese relation extraction with multi-grained information and external linguistic knowledge. In: ACL, pp. 4377–4386 (2019)
Ma, R., Peng, M., Zhang, Q., Wei, Z., Huang, X.J.: Simplify the usage of lexicon in Chinese NER. In: ACL, pp. 5951–5960 (2020)
Ma, Y., Cao, Y., Hong, Y., Sun, A.: Large language model is not a good few-shot information extractor, but a good reranker for hard samples! CoRR abs/2303.08559 (2023)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: NIPS, pp. 3111–3119 (2013)
Ouyang, L., et al.: Training language models to follow instructions with human feedback. In: NIPS, pp. 27730–27744 (2022)
Peng, N., Dredze, M.: Named entity recognition for Chinese social media with jointly trained embeddings. In: EMNLP, pp. 548–554 (2015)
Qin, H., Tian, Y., Song, Y.: Relation extraction with word graphs from N-grams. In: EMNLP, pp. 2860–2868 (2021)
Sachan, D., Zhang, Y., Qi, P., Hamilton, W.L.: Do syntax trees help pre-trained transformers extract information? In: EACL, pp. 2647–2661 (2021)
Sui, D., Chen, Y., Liu, K., Zhao, J., Liu, S.: Leverage lexical knowledge for Chinese named entity recognition via collaborative graph network. In: EMNLP-IJCNLP, pp. 3830–3840 (2019)
Vaswani, A., et al.: Attention is all you need. In: NIPS, pp. 5998–6008 (2017)
Wan, Q., Wan, C., Hu, R., Liu, D.: Chinese financial event extraction based on syntactic and semantic dependency parsing. Chin. J. Comput. 44(3), 508–530 (2021)
Wang, C., et al.: EasyNLP: a comprehensive and easy-to-use toolkit for natural language processing. In: EMNLP, pp. 22–29 (2022)
Wu, S., Song, X., Feng, Z.: MECT: multi-metadata embedding based cross-transformer for Chinese named entity recognition. In: ACL-IJCNLP, pp. 1529–1539 (2021)
Xu, J., Wen, J., Sun, X., Su, Q.: A discourse-level named entity recognition and relation extraction dataset for Chinese literature text. CoRR abs/1711.07010 (2017)
Xu, Y., Mou, L., Li, G., Chen, Y., Peng, H., Jin, Z.: Classifying relations via long short term memory networks along shortest dependency paths. In: EMNLP, pp. 1785–1794 (2015)
Zeng, A., et al.: GLM-130B: an open bilingual pre-trained model. CoRR abs/2210.02414 (2022)
Zeng, D., Liu, K., Lai, S., Zhou, G., Zhao, J.: Relation classification via convolutional deep neural network. In: COLING, pp. 2335–2344 (2014)
Zhang, T., et al.: HORNET: enriching pre-trained language representations with heterogeneous knowledge sources. In: CIKM, pp. 2608–2617 (2021)
Zhang, T., et al.: DKPLM: decomposable knowledge-enhanced pre-trained language model for natural language understanding. In: AAAI, pp. 11703–11711 (2022)
Zhang, Y., Yang, J.: Chinese NER using lattice LSTM. In: ACL, pp. 1554–1564 (2018)
Zhao, S., Hu, M., Cai, Z., Zhang, Z., Zhou, T., Liu, F.: Enhancing Chinese character representation with lattice-aligned attention. IEEE Trans. Neural Netw. Learn. Syst. 34(7), 3727–3736 (2023). https://doi.org/10.1109/TNNLS.2021.3114378
Acknowledgments
This work is supported by the Guangzhou Science and Technology Program key projects (202103010005), the National Natural Science Foundation of China (61876066) and Alibaba Cloud Group through the Research Talent Program with South China University of Technology.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhou, J. et al. (2023). UKT: A Unified Knowledgeable Tuning Framework for Chinese Information Extraction. In: Liu, F., Duan, N., Xu, Q., Hong, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2023. Lecture Notes in Computer Science, vol 14303. Springer, Cham. https://doi.org/10.1007/978-3-031-44696-2_17
DOI: https://doi.org/10.1007/978-3-031-44696-2_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44695-5
Online ISBN: 978-3-031-44696-2
eBook Packages: Computer Science, Computer Science (R0)