DOI: 10.1145/3232116.3232131
research-article

Chinese word segmentation model based on BI_GRU_AT_HN_CRF_6

Published: 19 May 2018

ABSTRACT

Chinese word segmentation is an indispensable step in natural language processing, and arguably the most important one. Applying recurrent neural networks to Chinese word segmentation has recently become a trend. Researchers have proposed various segmentation models based on long short-term memory (LSTM) networks and on gated recurrent unit (GRU) networks. Both LSTM and GRU are types of recurrent neural network that learn features automatically and retain both long-term and short-term memory. However, as sentences grow longer, the distance between mutually dependent features in the context increases, so the historical and future feature information that a given sentence depends on is lost, which reduces segmentation accuracy. To address this problem, this paper introduces the attention mechanism and proposes the BI_GRU_AT_HW_CRF_6 neural network segmentation model. Experiments show that, with the attention mechanism, the model performs better in accuracy, training speed, and prediction speed as sentence length varies.
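
As a rough illustration of why attention can help with long sentences, the sketch below computes scaled dot-product self-attention over a toy sequence of hidden states. It is a generic example under stated assumptions, not the architecture of the paper: the abstract does not specify the scoring function, and the names `attend` and `states` and the toy vectors are hypothetical stand-ins for bidirectional GRU outputs. The point it shows is that every position can draw directly on every other position, regardless of the distance between them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(hidden_states, query_index):
    """Return (weights, context) for one position in the sequence.

    hidden_states: list of equal-length vectors, one per character
    (hypothetical stand-ins for bidirectional GRU outputs).
    The scoring function (scaled dot product) is an assumption.
    """
    query = hidden_states[query_index]
    dim = len(query)
    scale = math.sqrt(dim)
    # Score the query position against every position, near or far.
    scores = [dot(query, h) / scale for h in hidden_states]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of all hidden states.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context


# Toy sequence of four 2-dimensional hidden states (made-up values).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
weights, context = attend(states, 0)
```

In this toy run, positions 0 and 2 hold identical states and therefore receive identical weights, even though they are not adjacent; a recurrent model without attention would have to carry that information step by step through its hidden state.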

    • Published in

      ACM Other conferences
      ICIIP '18: Proceedings of the 3rd International Conference on Intelligent Information Processing
      May 2018
      249 pages
      ISBN: 9781450364966
      DOI: 10.1145/3232116

      Copyright © 2018 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 87 of 367 submissions, 24%