A Chinese Named Entity Recognition Method Fusing Word and Radical Features

Authors:

Ping Lu

AIPR '22: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition

Pages 502 - 508

https://doi.org/10.1145/3573942.3574055

Published: 16 May 2023

Abstract

Named Entity Recognition (NER) is a subtask of natural language processing, and its accuracy is crucial for downstream tasks. In Chinese NER, word information is often added to enhance the semantic and boundary information of Chinese words, but such methods ignore the radical information of Chinese characters. This paper proposes a multi-feature fusion model (MFFM) for Chinese NER. First, the input sequence is fed to the BERT layer, the word embedding layer and the radical embedding layer, respectively; then the outputs of these three layers are combined as the input of a Bidirectional Long Short-Term Memory (BiLSTM) layer, which models the contextual information; finally, the sequence is labeled with a conditional random field (CRF). The proposed model not only avoids introducing complex structures but also effectively captures the character features of the context, thereby improving recognition performance. The experimental results show that the F1 score of MFFM reaches 71.02% on the Weibo dataset, which is 3.12% higher than that of the BERT model, and 82.78% on the OntoNotes 4.0 dataset, which is 0.85% higher than that of the BERT model.
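The pipeline described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract only says the three layer outputs are "combined", so concatenation is an assumption here, and the function names `fuse_features` and `viterbi_decode`, the tag set, and all numeric scores are hypothetical. The BiLSTM is omitted; the hand-written emission scores stand in for what a trained BiLSTM would produce from the fused features. The second function shows the decoding step that a CRF layer typically uses to pick the best tag sequence.

```python
from typing import List, Sequence


def fuse_features(
    bert_vecs: Sequence[Sequence[float]],
    word_vecs: Sequence[Sequence[float]],
    radical_vecs: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate the three per-character feature vectors position by position.

    Assumption: "combined" in the abstract is read as simple concatenation.
    """
    if not (len(bert_vecs) == len(word_vecs) == len(radical_vecs)):
        raise ValueError("all three feature sequences must have the same length")
    return [
        list(b) + list(w) + list(r)
        for b, w, r in zip(bert_vecs, word_vecs, radical_vecs)
    ]


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
    tags: Sequence[str],
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][j]   : score of tag j at position t (stand-in for BiLSTM output)
    transitions[i][j] : score of moving from tag i to tag j (CRF parameters)
    """
    if not emissions:
        return []
    n = len(tags)
    score = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n):
            # Best previous tag for landing on tag j at position t.
            best_i = max(range(n), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag.
    best = max(range(n), key=lambda i: score[i])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return [tags[i] for i in path]


if __name__ == "__main__":
    # Toy inputs: every number below is made up for illustration.
    fused = fuse_features([[0.1, 0.2]], [[0.3]], [[0.4]])
    print(fused)

    tags = ["O", "B-PER", "I-PER"]
    forbidden = -1e4  # discourage the invalid move O -> I-PER
    transitions = [
        [0.0, 0.0, forbidden],
        [0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0],
    ]
    emissions = [[0.1, 2.0, 0.0], [0.0, 0.5, 1.5], [2.0, 0.0, 0.3]]
    print(viterbi_decode(emissions, transitions, tags))
```

In this sketch the transition table is where boundary constraints live: a large negative score on `O -> I-PER` keeps the decoder from starting an entity in the middle, even when the emission scores alone would prefer that tag.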

Cited By

Mao X, Jiang J, Zeng Y, Peng Y, Zhang S, Li F (2024). Generative named entity recognition framework for Chinese legal domain. PeerJ Computer Science 10 (e2428). Online publication date: 4-Nov-2024.
https://doi.org/10.7717/peerj-cs.2428
Li C, Qian Y, Pan L (2023). Multimodal Features Enhanced Named Entity Recognition Based on Self-Attention Mechanism. 2023 8th International Conference on Data Science in Cyberspace (DSC), 52-59. Online publication date: 18-Aug-2023.
https://doi.org/10.1109/DSC59305.2023.00018

Index Terms

A Chinese Named Entity Recognition Method Fusing Word and Radical Features
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Information extraction

Information & Contributors

Information

Published In

AIPR '22: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition

September 2022

1221 pages

ISBN:9781450396899

DOI:10.1145/3573942

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

AIPR 2022

AIPR 2022: 2022 5th International Conference on Artificial Intelligence and Pattern Recognition

September 23 - 25, 2022

Xiamen, China

Bibliometrics

Total Citations: 2
Total Downloads: 40
Downloads (Last 12 months): 13
Downloads (Last 6 weeks): 2

Reflects downloads up to 01 Mar 2025
