ABSTRACT
Chinese word segmentation is an indispensable step in natural language processing, and arguably the most important one. Applying recurrent neural networks to Chinese word segmentation has recently become a trend, and researchers have proposed a variety of segmentation models based on the long short-term memory (LSTM) network and on the gated recurrent unit (GRU) network. Both LSTM and GRU are types of recurrent neural network and inherit its ability to learn features automatically and to retain both long-term and short-term memory. However, as sentences grow longer, the distance between interdependent features in the context also grows, so the historical and future feature information that a given sentence depends on is lost, which reduces segmentation accuracy. To address this problem, this paper introduces an attention mechanism and proposes the BI_GRU_AT_HW_CRF_6 neural network segmentation model. Experiments show that, with the attention mechanism in place, the model performs better as sentence length varies, in terms of accuracy, training speed, and prediction speed.
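To illustrate the general idea of attention over recurrent hidden states, the following is a minimal sketch and not the paper's implementation. It assumes scaled dot-product self-attention, which the abstract does not specify; the function names (`softmax`, `dot`, `attend`) and the toy hidden states are hypothetical. Each position receives a context vector that is a weighted sum of all hidden states, so distant positions can contribute directly rather than only through the recurrent chain.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def attend(hidden_states):
    """Return one context vector per position.

    Assumption: scaled dot-product self-attention. Each hidden state
    acts as a query scored against every hidden state in the sentence.
    """
    dim = len(hidden_states[0])
    scale = math.sqrt(dim)
    contexts = []
    for query in hidden_states:
        weights = softmax([dot(query, key) / scale for key in hidden_states])
        # Weighted sum of all hidden states, including distant ones.
        context = [
            sum(w * h[i] for w, h in zip(weights, hidden_states))
            for i in range(dim)
        ]
        contexts.append(context)
    return contexts

# Hypothetical hidden states for a 4-character sentence (dimension 2),
# standing in for the outputs of a bidirectional GRU.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
contexts = attend(states)
```

In a full tagger, context vectors like these would typically be combined with the recurrent outputs before a tagging layer such as a CRF; how the paper combines them is not stated in the abstract.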
Index Terms
- Chinese word segmentation model based on BI_GRU_AT_HN_CRF_6