DOI: 10.1145/3573942.3574055
research-article

A Chinese Named Entity Recognition Method Fusing Word and Radical Features

Published: 16 May 2023

ABSTRACT

Named Entity Recognition (NER) is a subtask of natural language processing, and its accuracy is crucial for downstream tasks. In Chinese NER, word information is often added to enrich the semantic and boundary information of Chinese words, but these methods ignore the radical information of Chinese characters. This paper proposes a multi-feature fusion model (MFFM) for Chinese NER. First, the input sequence is fed to the BERT layer, the word embedding layer, and the radical embedding layer; then the outputs of these three layers are combined and passed to a Bidirectional Long Short-Term Memory (BiLSTM) layer to model contextual information; finally, a conditional random field (CRF) annotates the sequence. The proposed model avoids introducing complex structures while effectively capturing the character features of the context, which improves recognition performance. Experimental results show that the F1 score of MFFM reaches 71.02% on the Weibo dataset, 3.12% higher than that of the BERT model, and 82.78% on the OntoNotes4.0 dataset, 0.85% higher than that of the BERT model.
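
The pipeline described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It makes several assumptions: "combined" is read as per-character concatenation, the BiLSTM is omitted and its output is replaced by hand-written emission scores, and the CRF step is shown only as Viterbi decoding over a toy transition matrix. The function names `fuse_features` and `viterbi_decode`, the tag set, and all numbers are hypothetical.

```python
from typing import List, Sequence


def fuse_features(bert: Sequence[Sequence[float]],
                  word: Sequence[Sequence[float]],
                  radical: Sequence[Sequence[float]]) -> List[List[float]]:
    """Concatenate the three feature vectors of each character (assumed fusion)."""
    if not (len(bert) == len(word) == len(radical)):
        raise ValueError("all feature sequences must cover the same characters")
    return [list(b) + list(w) + list(r) for b, w, r in zip(bert, word, radical)]


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag path of a linear-chain CRF.

    emissions[t][j]   score of tag j at position t (a BiLSTM would produce these)
    transitions[i][j] score of moving from tag i to tag j
    """
    n_tags = len(transitions)
    scores = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        step_scores, step_back = [], []
        for j in range(n_tags):
            # Best previous tag for ending at tag j in this position.
            best_prev = max(range(n_tags),
                            key=lambda i: scores[i] + transitions[i][j])
            step_scores.append(scores[best_prev] + transitions[best_prev][j] + emit[j])
            step_back.append(best_prev)
        scores = step_scores
        backpointers.append(step_back)
    # Follow the backpointers from the best final tag to recover the path.
    path = [max(range(n_tags), key=lambda j: scores[j])]
    for step_back in reversed(backpointers):
        path.append(step_back[path[-1]])
    path.reverse()
    return path


# Toy input: 3 characters with 2-dim "BERT", 1-dim word and 1-dim radical features.
fused = fuse_features([[0.1, 0.2]] * 3, [[0.3]] * 3, [[0.4]] * 3)

# Hypothetical tags: 0 = O, 1 = B, 2 = I. O -> I is heavily penalised.
emissions = [[1.0, 2.0, 0.0], [1.0, 0.0, 1.5], [2.0, 0.0, 0.5]]
transitions = [[0.0, 0.0, -10.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
print(viterbi_decode(emissions, transitions))  # -> [1, 2, 0], i.e. B I O
```

In a full model the fused vectors would feed a BiLSTM whose outputs supply the emission scores, and the transition scores would be learned jointly with the rest of the network.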

Published in

AIPR '22: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition
September 2022
1221 pages
ISBN: 9781450396899
DOI: 10.1145/3573942

Copyright © 2022 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited