Abstract:
This paper presents a Chinese word segmentation system based on conditional random fields (CRFs) that integrates the output of an N-gram model as CRF features. The combination is motivated by the complementary strengths of the two models: a dictionary-based N-gram model handles in-vocabulary words very well, while conditional random fields are better at recognizing out-of-vocabulary words. The approach is evaluated on the PKU data from the SIGHAN Bakeoff 2005. The experimental results show that the method achieves an F-measure of 95.0%, with high out-of-vocabulary recall (Roov, 85.2%) and in-vocabulary recall (Riv, 97.9%).
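The idea of passing an N-gram segmenter's result to a CRF as a feature can be illustrated with a minimal sketch. The abstract does not specify the feature templates or the tag set, so the B/M/E/S tagging scheme, the function names `words_to_bmes` and `char_features`, and the particular context features below are all assumptions made for illustration, not the authors' configuration. The N-gram segmentation is taken as given input here rather than computed.

```python
from typing import Dict, List


def words_to_bmes(words: List[str]) -> List[str]:
    """Convert a word sequence into per-character B/M/E/S position tags."""
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # Begin, zero or more Middle, End
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def char_features(sentence: str, ngram_words: List[str]) -> List[Dict[str, str]]:
    """Build one feature dict per character for a character-tagging CRF.

    `ngram_words` is the segmentation proposed by a (hypothetical) N-gram
    segmenter; its B/M/E/S tag is added as an extra feature alongside
    simple character-context features (an assumed, illustrative set).
    """
    if "".join(ngram_words) != sentence:
        raise ValueError("N-gram segmentation does not cover the sentence")
    ngram_tags = words_to_bmes(ngram_words)
    features: List[Dict[str, str]] = []
    for i, ch in enumerate(sentence):
        prev_ch = sentence[i - 1] if i > 0 else "<BOS>"
        next_ch = sentence[i + 1] if i < len(sentence) - 1 else "<EOS>"
        features.append({
            "char": ch,
            "prev_char": prev_ch,
            "next_char": next_ch,
            "bigram_prev": prev_ch + ch,
            "bigram_next": ch + next_ch,
            "ngram_tag": ngram_tags[i],  # the N-gram model's result as a feature
        })
    return features


if __name__ == "__main__":
    for feats in char_features("我爱北京", ["我", "爱", "北京"]):
        print(feats)
```

Feature dictionaries of this shape could then be handed to any linear-chain CRF trainer that accepts per-token features. The CRF remains free to override the `ngram_tag` suggestion when the character context indicates an out-of-vocabulary word.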
Date of Conference: 15-17 July 2012
Date Added to IEEE Xplore: 24 November 2012