Multi-task learning with helpful word selection for lexicon-enhanced Chinese NER

Tian, Xuetao; Bu, Xiaoxuan; He, Lu

doi:10.1007/s10489-023-04464-0

Published: 17 February 2023

Volume 53, pages 19028–19043 (2023)
Applied Intelligence

Abstract

Named entity recognition (NER) is a common task in natural language processing, but it remains more challenging in Chinese because the language lacks natural word delimiters. Many recent works incorporate an external lexicon into character-level Chinese NER, focusing on how to integrate the matched lexicon words into a specific model such as an LSTM or a Transformer. In this setting, however, performance depends strongly on the quality of the lexicon and on how well the lexicon matches the corpus. In practice, some of the words supplied by the lexicon are noise and do not help Chinese NER. To address this issue, we propose a simple but effective multi-task learning method with helpful word selection for lexicon-enhanced Chinese NER. One task scores the matched words and selects the top-K most helpful ones. The other task integrates the selected words through a multi-head attention network and performs Chinese NER as character-level sequence labeling. The two tasks are learned jointly with a shared encoder. Experiments on three public datasets show that the proposed method outperforms recent advanced baselines.
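The abstract gives no formulas, so the pipeline it describes (match lexicon words, score them, keep the top-K, attend over the kept words) can only be illustrated loosely. The sketch below is a toy version under stated assumptions: the dot-product scoring, the single attention head (the paper uses multi-head attention), the `max_len` cutoff, and all function names are hypothetical choices made for illustration, not the authors' implementation.

```python
import math


def match_words(sentence, lexicon, max_len=4):
    """Return (start, end, word) for each multi-character span found in the lexicon."""
    matches = []
    for start in range(len(sentence)):
        last_end = min(len(sentence), start + max_len)
        for end in range(start + 2, last_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                matches.append((start, end, word))
    return matches


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def select_top_k(char_vec, word_vecs, k):
    """Assumed scoring rule: dot product between a character vector and each word vector.

    Returns the indices of the k highest-scoring words and all scores.
    """
    scores = [dot(char_vec, w) for w in word_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k], scores


def attend(char_vec, word_vecs):
    """Single-head scaled dot-product attention of one character over word vectors."""
    dim = len(char_vec)
    logits = [dot(char_vec, w) / math.sqrt(dim) for w in word_vecs]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(wt * w[j] for wt, w in zip(weights, word_vecs)) for j in range(dim)]


if __name__ == "__main__":
    lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
    for span in match_words("南京市长江大桥", lexicon):
        print(span)
```

In the example sentence, the lexicon also matches "市长" (mayor), which crosses the boundary between "南京市" and "长江大桥". This is the kind of unhelpful match that a selection step is meant to filter out before the remaining words are integrated.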

Data Availability

In this paper, we conducted experiments on three public datasets: Ontonotes4, Weibo and Resume. Their availability is as follows.

- Ontonotes4, which supports the findings of this study, is available from the Linguistic Data Consortium (https://catalog.ldc.upenn.edu/LDC2011T03), but restrictions apply to its availability: the data were used under license for the current study and are therefore not publicly available. The data are, however, available from the authors upon reasonable request and with the permission of the Linguistic Data Consortium.

- Weibo was released with the published paper (doi: 10.18653/v1/D15-1064) and can be downloaded from https://github.com/hltcoe/golden-horse.

- Resume was released with the published paper (doi: 10.18653/v1/P18-1144) and can be downloaded from https://github.com/jiesutd/LatticeLSTM.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grants 62207002 and U1911201, and by the China Postdoctoral Science Foundation under Grants 2022TQ0040 and 2022M720486.

Author information

Authors and Affiliations

Faculty of Psychology, Beijing Normal University, Beijing, China
Xuetao Tian & Xiaoxuan Bu
School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
Lu He

Corresponding author

Correspondence to Xuetao Tian.

Ethics declarations

Competing interests

- The authors have no relevant financial or non-financial interests to disclose.

- The authors have no competing interests to declare that are relevant to the content of this article.

- All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

- The authors have no financial or proprietary interests in any material discussed in this article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tian, X., Bu, X. & He, L. Multi-task learning with helpful word selection for lexicon-enhanced Chinese NER. Appl Intell 53, 19028–19043 (2023). https://doi.org/10.1007/s10489-023-04464-0

Accepted: 08 January 2023
Published: 17 February 2023
Issue Date: August 2023
DOI: https://doi.org/10.1007/s10489-023-04464-0
