Abstract
Named entity recognition (NER) is a common task in natural language processing, but it remains more challenging in Chinese because the text lacks natural word delimiters. Many recent works incorporate an external lexicon into character-level Chinese NER, focusing on how to integrate the words matched in the lexicon into a specific model such as an LSTM or a Transformer. In this setting, however, performance depends strongly on the quality of the lexicon and on how well the lexicon matches the corpus. In practice, some of the words supplied by the lexicon are noise and do not help Chinese NER. To address this issue, we propose a simple but effective multi-task learning method with helpful word selection for lexicon-enhanced Chinese NER. One task scores the matched words and selects the top-K most helpful ones. The other task integrates the selected words through a multi-head attention network and performs Chinese NER as character-level sequence labeling. The two tasks are learned jointly with a shared encoder. Experiments on three public datasets show that the proposed method outperforms recent advanced baselines.
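To make the word-selection idea concrete, the following is a minimal sketch of scoring matched words, keeping the top-K, and pooling their vectors. It is an illustration under stated assumptions, not the authors' implementation: the function names `select_top_k`, `softmax` and `pool_selected` are hypothetical, the scores and vectors are made-up toy values rather than outputs of a jointly trained encoder, and a single softmax-weighted sum stands in for the multi-head attention network named in the abstract.

```python
import math
from typing import List, Sequence, Tuple


def select_top_k(words: Sequence[str], scores: Sequence[float], k: int) -> List[Tuple[str, float]]:
    """Return the k highest-scoring matched words, best first (hypothetical helper)."""
    if len(words) != len(scores):
        raise ValueError("words and scores must have the same length")
    ranked = sorted(zip(words, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:max(k, 0)]


def softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a non-empty sequence."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def pool_selected(vectors: Sequence[Sequence[float]], scores: Sequence[float]) -> List[float]:
    """Softmax-weighted sum of word vectors.

    Assumption: a single-head simplification standing in for multi-head attention.
    """
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]


# Toy example: words a lexicon might match around one character of "南京市长江大桥".
# The scores below are invented for illustration only.
matched = ["南京市", "市长", "长江大桥"]
helpfulness = [0.9, 0.1, 0.8]
selected = select_top_k(matched, helpfulness, k=2)
print(selected)
```

In this toy case the low-scoring word "市长" is dropped before integration, which is the kind of lexicon noise the abstract describes; in the paper the scores are learned jointly with the NER task rather than fixed by hand.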
Data Availability
In this paper, we conducted the experiments based on three public datasets: Ontonotes4, Weibo and Resume. The data availability statements are as follows.
- The Ontonotes4 data that support the findings of this study are available from the Linguistic Data Consortium (https://catalog.ldc.upenn.edu/LDC2011T03), but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. The data are, however, available from the authors upon reasonable request and with permission of the Linguistic Data Consortium.
- Weibo was released with the published paper (doi: 10.18653/v1/D15-1064) and can be downloaded from https://github.com/hltcoe/golden-horse.
- Resume was released with the published paper (doi: 10.18653/v1/P18-1144) and can be downloaded from https://github.com/jiesutd/LatticeLSTM.
References
Cetoli A, Bragaglia S, O’Harney AD, Sloan M (2018) Graph convolutional networks for named entity recognition. In: Proceedings of the 16th international workshop on treebanks and linguistic theories, Prague, Czech Republic, January 23-24, pp 37–45
Chen C, Kong F (2021) Enhancing entity boundary detection for better Chinese named entity recognition. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing, Virtual Event, August 1-6, pp 20–25
Chen Y, Xu L, Liu K, Zeng D, Zhao J (2015) Event extraction via dynamic multi-pooling convolutional neural networks. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language, Beijing, China, July 26-31, pp 167–176
Chiu JPC, Nichols E (2016) Named entity recognition with bidirectional lstm-cnns. Trans Assoc Comput Linguistics 4:357–370
Collobert R, Weston J, Bottou L, Karlen M, Kavukcuoglu K, Kuksa PP (2011) Natural language processing (almost) from scratch. J Mach Learn Res 12:2493–2537
Cui Y, Che W, Liu T, Qin B, Yang Z (2021) Pre-training with whole word masking for Chinese BERT. IEEE ACM Trans Audio Speech Lang Process 29:3504–3514
Devlin J, Chang M, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 17th conference of the north american chapter of the association for computational linguistics: human language technologies, Minneapolis, MN, USA, June 2-7, pp 4171–4186
Ding R, Xie P, Zhang X, Lu W, Li L, Si L (2019) A neural multi-digraph model for Chinese NER with gazetteers. In: Proceedings of the 57th conference of the association for computational linguistics, Florence, Italy, July 28 – August 2, pp 1462–1467
Elman JL (1990) Finding structure in time. Cogn Sci 14(2):179–211
Gu Y, Qu X, Wang Z, Zheng Y, Huai B, Yuan NJ (2022) Delving deep into regularity: a simple but effective method for Chinese named entity recognition. In: Findings of the association for computational linguistics, Seattle, WA, United States, July 10-15, pp 1863–1873
Gui T, Ma R, Zhang Q, Zhao L, Jiang Y, Huang X (2019) Cnn-based Chinese NER with lexicon rethinking. In: Proceedings of the 28th international joint conference on artificial intelligence, Macao, China, August 10-16, pp 4982–4988
Gui T, Zou Y, Zhang Q, Peng M, Fu J, Wei Z, Huang X (2019) A lexicon-based graph neural network for Chinese NER. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing, Hong Kong, China, November 3-7, pp 1040–1050
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Hu B, Huang Z, Hu M, Zhang Z, Dou Y (2022) Adaptive threshold selective self-attention for Chinese NER. In: Proceedings of the 29th international conference on computational linguistics, Gyeongju, Republic of Korea, October 12-17, pp 1823–1833
Huang Z, Xu W, Yu K (2015) Bidirectional LSTM-CRF models for sequence tagging. CoRR. arXiv:1508.01991
Jin G, Chen X (2008) The fourth international Chinese language processing bakeoff: Chinese word segmentation, named entity recognition and Chinese POS tagging. In: Proceedings of the 3rd international joint conference on natural language processing, Hyderabad, India, January 7-12, pp 69–81
Joulin A, Grave E, Bojanowski P, Mikolov T (2017) Bag of tricks for efficient text classification. In: Proceedings of the 15th conference of the european chapter of the association for computational linguistics, Valencia, Spain, April 3-7, pp 427–431
Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. In: Proceedings of the 3rd international conference on learning representations, San Diego, CA, USA, May 7–9
Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: Proceedings of the 5th international conference on learning representations, Toulon, France, April 24–26
Lafferty JD, McCallum A, Pereira FCN (2001) Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of the 18th international conference on machine learning, Williams College, Williamstown, MA, USA, June 28 – July 1, pp 282–289
Levow G (2006) The third international Chinese language processing bakeoff: word segmentation and named entity recognition. In: Proceedings of the 5th workshop on chinese language processing, Sydney, Australia, July 22–23, pp 108–117
Li H, Hagiwara M, Li Q, Ji H (2014) Comparison of the impact of word segmentation on name tagging for Chinese and Japanese. In: Proceedings of the 9th international conference on language resources and evaluation, Reykjavik, Iceland, May 26–31, pp 2532–2536
Li S, Zhao Z, Hu R, Li W, Liu T, Du X (2018) Analogical reasoning on Chinese morphological and semantic relations. In: Proceedings of the 56th annual meeting of the association for computational linguistics, Melbourne, Australia, July 15–20, pp 138–143
Li X, Yan H, Qiu X, Huang X (2020) FLAT: Chinese NER using flat-lattice transformer. In: Proceedings of the 58th annual meeting of the association for computational linguistics, July 5–10, pp 6836–6842
Liu M, Tu Z, Wang Z, Xu X (2020) LTP: a new active learning strategy for bert-crf based named entity recognition. CoRR. arXiv:2001.02524
Liu Z, Zhu C, Zhao T (2010) Chinese named entity recognition with a sequence labeling approach: based on characters, or based on words? In: Proceedings of the 6th international conference on intelligent computing, Changsha, China, August 18–21, pp 634–640
Lothritz C, Allix K, Veiber L, Bissyandé TF, Klein J (2020) Evaluating pretrained transformer-based models on the task of fine-grained named entity recognition. In: Proceedings of the 28th international conference on computational linguistics, Barcelona, Spain, December 8–13, pp 3750–3760
Ma R, Peng M, Zhang Q, Wei Z, Huang X (2020) Simplify the usage of lexicon in Chinese NER. In: Proceedings of the 58th annual meeting of the association for computational linguistics, July 5–10, pp 5951–5960
Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. In: Proceedings of 1st international conference on learning representations, ICLR 2013, Scottsdale, Arizona, USA, May 2-4
Peng N, Dredze M (2015) Named entity recognition for Chinese social media with jointly trained embeddings. In: Proceedings of the 2015 conference on empirical methods in natural language processing, Lisbon, Portugal, September 17–21, pp 548–554
Pennington J, Socher R, Manning CD (2014) Glove: global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing, Doha, Qatar, October 25–29, pp 1532–1543
Riedel S, Yao L, McCallum A, Marlin BM (2013) Relation extraction with matrix factorization and universal schemas. In: Proceedings of human language technologies: conference of the North American chapter of the association of computational linguistics, Atlanta, Georgia, USA, June 9–14, pp 74–84
Ronran C, Lee S (2020) Effect of character and word features in bidirectional LSTM-CRF for NER. In: Proceedings of the 2020 IEEE international conference on big data and smart computing, Busan, Korea (South), February 19–22, pp 613–616
Song Y, Shi S, Li J, Zhang H (2018) Directional skip-gram: explicitly distinguishing left and right context for word embeddings. In: Proceedings of the 2018 conference of the north american chapter of the association for computational linguistics: human language technologies, New Orleans, Louisiana, USA, June 1–6, pp 175–180
Sui D, Chen Y, Liu K, Zhao J, Liu S (2019) Leverage lexical knowledge for Chinese named entity recognition via collaborative graph network. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing, Hong Kong, China, November 3–7, pp 3828–3838
Tang Z, Wan B, Yang L (2020) Word-character graph convolution network for Chinese named entity recognition. IEEE ACM Trans Audio Speech Lang Process 28:1520–1532
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Annual conference on neural information processing systems, Long Beach, CA, USA, December 4–9, pp 5998–6008
Weischedel R, Pradhan S, Ramshaw L, Palmer M, Xue N, Marcus M, Taylor A, Greenberg C, Hovy E, Belvin R (2011) OntoNotes Release 4.0. Philadelphia, Penn Linguistic Data Consortium
Wu S, Song X, Feng Z (2021) MECT: multi-metadata embedding based cross-transformer for Chinese named entity recognition. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing, Virtual Event, August 1–6, pp 1529–1539
Xue M, Yu B, Liu T, Zhang Y, Meng E, Wang B (2020) Porous lattice transformer encoder for Chinese NER. In: Proceedings of the 28th international conference on computational linguistics, Barcelona, Spain, December 8–13, pp 3831–3841
Yan H, Deng B, Li X, Qiu X (2019) TENER: adapting transformer encoder for named entity recognition. CoRR. arXiv:1911.04474
Yang J, Yang R, Wang C, Xie J (2018) Multi-entity aspect-based sentiment analysis with context, entity and aspect memory. In: Proceedings of the 32nd AAAI conference on artificial intelligence, the 30th innovative applications of artificial intelligence, and the 8th AAAI symposium on educational advances in artificial intelligence, New Orleans, Louisiana, USA, February 2–7, pp 6029–6036
Zelenko D, Aone C, Richardella A (2003) Kernel methods for relation extraction. J Mach Learn Res 3:1083–1106
Zhang Y, Yang J (2018) Chinese NER using lattice LSTM. In: Proceedings of the 56th annual meeting of the association for computational linguistics, Melbourne, Australia, July 15–20, pp 1554–1564
Zhang Z, Han X, Liu Z, Jiang X, Sun M, Liu Q (2019) ERNIE: enhanced language representation with informative entities. In: Proceedings of the 57th conference of the association for computational linguistics, Florence, Italy, July 28 – August 2, pp 1441–1451
Zhao S, Hu M, Cai Z, Chen H, Liu F (2021) Dynamic modeling cross- and self-lattice attention network for Chinese NER. In: Proceedings of the 35th AAAI conference on artificial intelligence, Virtual Event, February 2–9, pp 14515–14523
Zhu H, Hu W, Zeng Y (2019) Flexner: a flexible LSTM-CNN stack framework for named entity recognition. In: Proceedings of the 8th CCF international conference on natural language processing and Chinese computing, Dunhuang, China, October 9–14, pp 168–178
Zhu P, Cheng D, Yang F, Luo Y, Huang D, Qian W, Zhou A (2022) Improving Chinese named entity recognition by large-scale syntactic dependency graph. IEEE ACM Trans Audio Speech Lang Process 30:979–991
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grant 62207002, the China Postdoctoral Science Foundation under Grants 2022TQ0040 and 2022M720486, and the National Natural Science Foundation of China under Grant U1911201.
Ethics declarations
Competing interests
- The authors have no relevant financial or non-financial interests to disclose.
- The authors have no competing interests to declare that are relevant to the content of this article.
- All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
- The authors have no financial or proprietary interests in any material discussed in this article.
About this article
Cite this article
Tian, X., Bu, X. & He, L. Multi-task learning with helpful word selection for lexicon-enhanced Chinese NER. Appl Intell 53, 19028–19043 (2023). https://doi.org/10.1007/s10489-023-04464-0
DOI: https://doi.org/10.1007/s10489-023-04464-0