DOI: 10.1145/3573942.3574055
research-article

A Chinese Named Entity Recognition Method Fusing Word and Radical Features

Published: 16 May 2023

Abstract

Named Entity Recognition (NER) is a subtask of natural language processing, and its accuracy is crucial for downstream tasks. In Chinese NER, word information is often added to enrich the semantic and boundary information of Chinese words, but such methods ignore the radical information of Chinese characters. This paper proposes a multi-feature fusion model (MFFM) for Chinese NER. First, the input sequence is fed to the BERT layer, the word embedding layer, and the radical embedding layer, respectively. Then, the outputs of these three layers are combined and passed to a Bidirectional Long Short-Term Memory (BiLSTM) layer, which models contextual information. Finally, a conditional random field (CRF) labels the sequence. The proposed model avoids introducing complex structures while effectively capturing the character features of the context, thereby improving recognition performance. Experimental results show that MFFM reaches an F1 score of 71.02% on the Weibo dataset, 3.12% higher than the BERT model, and 82.78% on the OntoNotes 4.0 dataset, 0.85% higher than the BERT model.
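The pipeline described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. It assumes that "combined" means per-character concatenation, replaces the BERT, word, and radical layers with tiny hypothetical lookup tables, omits the BiLSTM entirely, and stands in for the CRF with plain Viterbi decoding over hand-supplied scores. All names and values below are illustrative assumptions.

```python
from typing import Dict, List, Sequence

# Hypothetical toy lookup tables (assumptions for illustration only). In the
# paper these vectors would come from BERT, a word-embedding layer, and a
# radical-embedding layer.
CHAR_VECS: Dict[str, List[float]] = {"北": [0.1, 0.2], "京": [0.3, 0.4]}
WORD_VECS: Dict[str, List[float]] = {"北京": [0.5, 0.6]}
RADICAL_OF: Dict[str, str] = {"北": "匕", "京": "亠"}
RADICAL_VECS: Dict[str, List[float]] = {"匕": [0.7], "亠": [0.8]}


def fuse_features(chars: Sequence[str], words: Sequence[str]) -> List[List[float]]:
    """Concatenate character, word, and radical vectors for each position.

    `words[i]` is the word that contains `chars[i]`. Concatenation is an
    assumed fusion operator; the abstract only says the outputs are combined.
    """
    fused = []
    for ch, word in zip(chars, words):
        fused.append(CHAR_VECS[ch] + WORD_VECS[word] + RADICAL_VECS[RADICAL_OF[ch]])
    return fused


def viterbi(emissions: Sequence[Sequence[float]],
            transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence, as a CRF layer does at inference.

    emissions[t][j] is the score of tag j at position t (in the real model it
    would be produced by the BiLSTM); transitions[i][j] scores moving from
    tag i to tag j.
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_tags):
            # Best previous tag for landing on tag j at this position.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path


TAGS = ["O", "B-LOC", "I-LOC"]
fused = fuse_features(["北", "京"], ["北京", "北京"])
# Hand-written scores; the O -> I-LOC transition is penalized as an invalid move.
toy_emissions = [[0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
toy_transitions = [[0.0, 0.0, -10.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
predicted = [TAGS[i] for i in viterbi(toy_emissions, toy_transitions)]
```

In a full model, each fused vector would pass through the BiLSTM to produce the emission scores, and the transition matrix would be learned jointly with the rest of the network rather than written by hand.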


    Published In

    AIPR '22: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition
    September 2022
    1221 pages
    ISBN:9781450396899
    DOI:10.1145/3573942
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. BERT
    2. Named Entity Recognition
    3. Radical information
    4. Word embedding

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    AIPR 2022
