Abstract:
We explore morphology-based and sub-word language modeling approaches proposed for morphologically rich languages, and evaluate and contrast them on a Turkish broadcast news transcription task. In addition, we improve our previously proposed morphology-integrated model for automatic speech recognition. This morphology-based model is built by composing the finite-state transducer of the morphological parser with a language model over lexical morphemes. The result is a morphology-integrated search network with an unlimited vocabulary that generates only valid word forms, which reduces the out-of-vocabulary rate and hence improves the word error rate. We also analyze the effect of morphotactics and morphological disambiguation on speech recognition accuracy for the morphology-integrated model. The improved morphology-integrated model performs better than statistically derived sub-word models, with the added benefit of generating morpho-syntactic and semantic features.
Date of Conference: 14-19 March 2010
Date Added to IEEE Xplore: 28 June 2010