Morphology-based and sub-word language modeling for Turkish speech recognition | IEEE Conference Publication | IEEE Xplore


Abstract:

We explore morphology-based and sub-word language modeling approaches proposed for morphologically rich languages, and evaluate and contrast them on a Turkish broadcast news transcription task. In addition, we improve our previously proposed morphology-integrated model for automatic speech recognition, which is a morphology-based model. This model is built by composing the finite-state transducer of the morphological parser with a language model over lexical morphemes. This approach provides a morphology-integrated search network with an unlimited vocabulary that generates only valid word forms, reducing the out-of-vocabulary rate and hence improving the word error rate. We also analyze the effect of morphotactics and morphological disambiguation on speech recognition accuracy for the morphology-integrated model. The improved morphology-integrated model performs better than statistically derived sub-word models, with the added benefit of generating morpho-syntactic and semantic features.
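
The following toy sketch illustrates why a morphology-aware acceptor can lower the out-of-vocabulary rate compared with a fixed word list. It is not the model described in the abstract and performs no finite-state transducer composition. The stems, suffix table, morphotactics, and the names `parse` and `oov_rate` are all hypothetical, chosen only for illustration, and the toy ignores vowel harmony.

```python
# Toy illustration only: the lexicon, suffix inventory, and morphotactics below
# are hypothetical and far smaller than a real Turkish morphological parser.

STEMS = {"ev", "kitap"}  # "house", "book"

# Surface suffix -> lexical morpheme (vowel harmony is NOT checked here).
SUFFIXES = {
    "ler": "+lAr", "lar": "+lAr",                        # plural
    "de": "+DA", "da": "+DA", "te": "+DA", "ta": "+DA",  # locative
}

# Toy morphotactics: which lexical morphemes may follow each state.
NEXT = {"STEM": {"+lAr", "+DA"}, "+lAr": {"+DA"}, "+DA": set()}


def parse(word):
    """Return the lexical morphemes of `word`, or None if no valid analysis exists."""
    for stem in STEMS:
        if not word.startswith(stem):
            continue
        rest, state, morphemes = word[len(stem):], "STEM", [stem]
        while rest:
            for surface, lexical in SUFFIXES.items():
                if rest.startswith(surface) and lexical in NEXT[state]:
                    morphemes.append(lexical)
                    rest, state = rest[len(surface):], lexical
                    break
            else:
                break  # no allowed suffix matched the remaining string
        if not rest:
            return morphemes
    return None


def oov_rate(test_words, in_vocab):
    """Fraction of test words that the vocabulary predicate rejects."""
    missing = sum(1 for w in test_words if not in_vocab(w))
    return missing / len(test_words)


train_words = {"ev", "evler", "kitap", "kitaplar"}
test_words = ["evlerde", "kitapta", "evde", "kitaplar"]

# Fixed word list: only forms seen in training are in vocabulary.
word_oov = oov_rate(test_words, lambda w: w in train_words)
# Morphology-aware acceptor: any stem plus a valid suffix sequence is accepted.
morph_oov = oov_rate(test_words, lambda w: parse(w) is not None)

print(word_oov, morph_oov)
```

On this toy data the word list misses three of the four test words, while the acceptor covers all four and still rejects an invalid suffix order such as `evdeler`.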
Date of Conference: 14-19 March 2010
Date Added to IEEE Xplore: 28 June 2010
Conference Location: Dallas, TX, USA
