DOI: 10.1145/3358528.3358562
research-article

A Radical-Based Method for Chinese Named Entity Recognition

Published: 28 August 2019

ABSTRACT

Chinese characters are composed of radicals, which divide into "shape parts" (carrying semantics) and "sound parts" (carrying pronunciation). Because Chinese script is logographic, many radicals carry semantic information that can improve the performance of Chinese named entity recognition. Many related studies use a Bi-LSTM to extract semantic features from radicals. However, LSTM-based models cannot extract this information effectively, because the granularity at which characters are split into radicals is ambiguous and the dependencies between radicals are weak. This paper therefore presents RCBC (Radical CNN-BiLSTM-CRF), a radical-based neural network method. Experimental results on the SIGHAN 2006 Bakeoff MSRA dataset and Peking University's 1998 People's Daily dataset indicate that the model effectively extracts the semantic information of Chinese radicals and improves Chinese named entity recognition performance compared with the traditional model.
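The abstract names the architecture but gives no layer details, so the following is only a minimal sketch of how a CNN might turn a character's radicals into a fixed-size feature vector. The decomposition table, the dimensions, the random weights, and the function names are all illustrative assumptions, not the authors' implementation. In a full RCBC-style model, such a vector would presumably be combined with a character embedding and passed to a BiLSTM-CRF tagger, which is omitted here.

```python
import random

# Illustrative decomposition table (an assumption; a real system would use a
# full radical dictionary and a fixed decomposition granularity).
RADICALS = {
    "河": ["氵", "可"],
    "湖": ["氵", "古", "月"],
    "北": ["北"],
}

EMB_DIM = 4      # radical embedding size (arbitrary choice)
NUM_FILTERS = 3  # number of convolution filters (arbitrary choice)
WIDTH = 2        # convolution window over adjacent radicals (arbitrary choice)

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def random_vector(n):
    """Stand-in for learned parameters: n values drawn uniformly from [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def conv1d_max_pool(radical_vecs, filters, width):
    """Slide each filter over windows of radical vectors, then max-pool over positions."""
    dim = len(radical_vecs[0])
    # Zero-pad so a character with fewer radicals than `width` still yields one window.
    padded = radical_vecs + [[0.0] * dim] * max(0, width - len(radical_vecs))
    pooled = []
    for filt in filters:  # each filter is a flat list of width * dim weights
        best = float("-inf")
        for start in range(len(padded) - width + 1):
            window = [x for vec in padded[start:start + width] for x in vec]
            best = max(best, sum(w * x for w, x in zip(filt, window)))
        pooled.append(best)
    return pooled


# Randomly initialised radical embeddings and filters (untrained placeholders).
radical_emb = {r: random_vector(EMB_DIM) for rads in RADICALS.values() for r in rads}
filters = [random_vector(WIDTH * EMB_DIM) for _ in range(NUM_FILTERS)]


def radical_features(char):
    """Return one pooled value per filter for the given character."""
    vecs = [radical_emb[r] for r in RADICALS[char]]
    return conv1d_max_pool(vecs, filters, WIDTH)


features = {ch: radical_features(ch) for ch in RADICALS}
```

Because the pooling step takes a maximum over window positions, every character maps to a vector of the same length no matter how many radicals it has. This suits the weak ordering dependency between radicals that the abstract describes.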

      Published in

      ICBDT '19: Proceedings of the 2nd International Conference on Big Data Technologies
      August 2019
      382 pages
      ISBN: 9781450371926
      DOI: 10.1145/3358528

      Copyright © 2019 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited
