Investigating Self-Attention Network for Chinese Word Segmentation

IEEE Journals & Magazine | IEEE Xplore
Abstract:

Neural networks have become the dominant method for Chinese word segmentation. Most existing models cast the task as sequence labeling, using BiLSTM-CRF to represent the input and make output predictions. Recently, attention-based sequence models have emerged as a highly competitive alternative to LSTMs, allowing better running speed through parallelization of computation. We investigate the self-attention network (SAN) for Chinese word segmentation, making comparisons with BiLSTM-CRF models. In addition, the influence of contextualized character embeddings is investigated using BERT, and a method is proposed for integrating word information into SAN segmentation. Results show that SAN gives highly competitive results compared with BiLSTMs, with BERT and word information further improving both in-domain and cross-domain segmentation. Our final models give the best results on 6 heterogeneous domain benchmarks.
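The self-attention mechanism the abstract contrasts with BiLSTMs can be illustrated with a minimal single-head sketch over character embeddings. This is an illustrative toy (the function name, dimensions, and use of NumPy are assumptions, not details from the paper); the paper's SAN would stack multi-head layers with position information and a labeling output layer.

```python
import numpy as np

def self_attention(X):
    """Single-head scaled dot-product self-attention over a character
    sequence X of shape (seq_len, d). Each output row is a mixture of
    all input rows, so every position is computed in parallel -- the
    speed advantage over sequential LSTM recurrence noted in the abstract."""
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)                        # pairwise scores
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)        # row-wise softmax
    return weights @ X                                   # contextualized reps

# Toy input: 4 "characters" with embedding dimension 8.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
H = self_attention(X)
print(H.shape)  # (4, 8): one contextualized vector per character
```

In a full segmenter, each contextualized vector would be projected to tag scores (e.g. B/M/E/S boundary labels) for sequence labeling.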
Page(s): 2933 - 2941
Date of Publication: 13 October 2020
