DOI: 10.1145/3529466.3529478
research-article

Named Entity Recognition of Zhuang Language Based on the Feature of Initial Letter in Word

Published: 04 June 2022

ABSTRACT

Named entity recognition is an important task and a basis for intelligent information processing and knowledge representation learning for the Zhuang language. This paper proposes a BiLSTM-CNN-CRF network model that incorporates the case (uppercase or lowercase) of a word's initial letter and applies it to named entity recognition for Zhuang, a language that lacks a corpus annotated with named entities. First, word2vec is trained on unlabeled Zhuang text to obtain Zhuang word vectors. Then a convolutional neural network extracts character-level features of Zhuang words, yielding a character feature vector. These two vectors are concatenated with a randomly initialized initial-letter case feature vector, and the concatenated vectors are fed into a BiLSTM-CNN-CRF model for training, yielding an end-to-end named entity recognition model for Zhuang. Experimental results show that, without relying on hand-crafted features or external dictionaries, the proposed method outperforms the comparison models, achieving an F1 score of 80.37% on the named entity recognition task, which enables automated named entity recognition for the Zhuang language.
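
The input construction described in the abstract (word vector, character feature vector, and a randomly generated initial-letter case vector joined into one token representation) can be illustrated with a minimal sketch. This is not the authors' implementation: the three case categories, the vector dimensions, the initialization range, and the function names `initial_case_id`, `make_case_table`, and `build_input` are all assumptions made for illustration, and the word and character vectors are stand-in constants rather than word2vec or CNN outputs.

```python
import random

# Assumed case categories; the abstract only states that the feature
# reflects the case of a word's initial letter.
CASE_CATEGORIES = ("upper", "lower", "other")


def initial_case_id(word):
    """Return a category index based on the first character of the word."""
    if word and word[0].isupper():
        return 0
    if word and word[0].islower():
        return 1
    return 2  # digits, punctuation, or an empty string


def make_case_table(dim, seed=0):
    """Create one randomly initialized vector per case category.

    The uniform range is an arbitrary choice for this sketch.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.25, 0.25) for _ in range(dim)]
            for _ in CASE_CATEGORIES]


def build_input(word_vec, char_vec, word, case_table):
    """Concatenate word vector, character feature vector, and case vector."""
    return list(word_vec) + list(char_vec) + case_table[initial_case_id(word)]


# Stand-in vectors: 5-dim "word2vec" output and 3-dim "CNN" output.
case_table = make_case_table(dim=4)
token_vec = build_input([0.1] * 5, [0.2] * 3, "Gvangjsih", case_table)
print(len(token_vec))  # 12
```

In a full model, the concatenated vector for each token would be passed to the sequence encoder and CRF layer; the case vectors would typically be updated during training along with the other parameters.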


  • Published in

    ICIAI '22: Proceedings of the 2022 6th International Conference on Innovation in Artificial Intelligence
    March 2022
    240 pages
    ISBN:9781450395502
    DOI:10.1145/3529466

    Copyright © 2022 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • research-article
    • Research
    • Refereed limited