Learning Chinese Word Segmentation Based on Bidirectional GRU-CRF and CNN Network Model

Chenghai Yu, Shupei Wang, Jiajun Guo

Source Title: International Journal of Technology and Human Interaction (IJTHI) 15(3)

ISSN: 1548-3908 | EISSN: 1548-3916 | EISBN13: 9781522564164 | DOI: 10.4018/IJTHI.2019070104

MLA

Yu, Chenghai, et al. "Learning Chinese Word Segmentation Based on Bidirectional GRU-CRF and CNN Network Model." IJTHI, vol. 15, no. 3, 2019, pp. 47-62. http://doi.org/10.4018/IJTHI.2019070104

APA

Yu, C., Wang, S., & Guo, J. (2019). Learning Chinese Word Segmentation Based on Bidirectional GRU-CRF and CNN Network Model. International Journal of Technology and Human Interaction (IJTHI), 15(3), 47-62. http://doi.org/10.4018/IJTHI.2019070104

Chicago

Yu, Chenghai, Shupei Wang, and Jiajun Guo. "Learning Chinese Word Segmentation Based on Bidirectional GRU-CRF and CNN Network Model." International Journal of Technology and Human Interaction (IJTHI) 15, no. 3 (2019): 47-62. http://doi.org/10.4018/IJTHI.2019070104

Abstract

Chinese word segmentation is the foundation of Chinese natural language processing (NLP). With the development of deep learning, various neural network models have been applied to Chinese word segmentation. However, current neural network models suffer from several problems in this task: reliance on manual feature extraction, nonstandard word weights, an inability to use long-distance information effectively, and long training times. To address these problems, this article presents a CNN-Bidirectional GRU-CRF neural network model (CNN Bidirectional GRU CRF Network, CBiGCN). The model overcomes the fixed-window limit of conventional methods and achieves end-to-end processing. It applies a five-tag set method and a bias-variable-weight greedy strategy, supplemented by the Goldstein-Armijo rule. The model has a simple structure and is easy to operate. It learns features automatically, reduces the task-specific work that handcrafted features and data pre-processing require, and makes effective use of context information. The authors evaluate their system in an experiment on two Chinese word segmentation corpora. The experiment shows that the new model obtains better segmentation results and greatly reduces training time.
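The abstract frames segmentation as per-character tagging with a five-tag set but does not list the tags. As a minimal sketch, the Python below assumes the tag set B, B2, M, E, S (word beginning, second character, middle, end, single-character word), which is one known five-tag scheme and may differ from the one the authors use. The function names `tag_word`, `tag_sentence`, and `decode` are hypothetical and chosen only for illustration.

```python
# Sketch of five-tag character labeling for word segmentation.
# ASSUMPTION: the tag set is B, B2, M, E, S; the abstract does not name the tags.

def tag_word(word):
    """Return one tag per character of a single word."""
    n = len(word)
    if n == 1:
        return ["S"]            # single-character word
    if n == 2:
        return ["B", "E"]       # begin, end
    # begin, second character, zero or more middle characters, end
    return ["B", "B2"] + ["M"] * (n - 3) + ["E"]


def tag_sentence(words):
    """Turn a list of gold-standard words into (character, tag) pairs."""
    return [(ch, tag) for word in words for ch, tag in zip(word, tag_word(word))]


def decode(chars, tags):
    """Rebuild words from characters and predicted tags.

    A word is closed whenever an E or S tag is seen; any trailing
    characters without a closing tag are kept as a final word.
    """
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:
        words.append(current)
    return words


if __name__ == "__main__":
    gold = ["我", "喜欢", "自然语言", "处理"]
    pairs = tag_sentence(gold)
    chars = [ch for ch, _ in pairs]
    tags = [tag for _, tag in pairs]
    print(tags)
    print(decode(chars, tags))
```

In a tagging model of this kind, a network would predict the tag sequence for each character and a step like `decode` would turn the predictions back into words; here the tags come from the gold segmentation, so decoding simply recovers the input words.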
