Affix-augmented stem-based language model for persian | IEEE Conference Publication | IEEE Xplore


Abstract:

Language modeling is used in many NLP applications, such as machine translation, POS tagging, speech recognition, and information retrieval. A language model assigns a probability to a sequence of words. This task becomes challenging for highly inflectional languages. In this paper we investigate standard statistical language models on Persian, an inflectional language. We propose two variations of morphological language models that rely on a morphological analyzer to manipulate the dataset before modeling. We then discuss the shortcomings of these models and introduce a novel approach that exploits the structure of the language and produces more accurate results. Experimental results are encouraging, especially when we use n-gram models with a small training dataset.
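As a rough illustration of what "assigns a probability to a sequence of words" means for an n-gram model, the following is a minimal sketch of a bigram model with add-one smoothing. It is not the method proposed in the paper, which the abstract does not specify. The function names, the toy corpus, and the `+s` token standing in for an affix split off by a morphological analyzer are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences, with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def sequence_log_prob(tokens, unigrams, bigrams):
    """Log-probability of a token sequence under an add-one smoothed bigram model."""
    vocab_size = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    log_p = 0.0
    for prev, cur in zip(padded, padded[1:]):
        # Add-one (Laplace) smoothing keeps unseen bigrams from getting zero probability.
        log_p += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return log_p


# Hypothetical toy corpus: "+s" stands in for an affix separated from its stem.
corpus = [["book", "+s", "open"], ["book", "open"]]
unigrams, bigrams = train_bigram(corpus)

seen = sequence_log_prob(["book", "open"], unigrams, bigrams)
unseen = sequence_log_prob(["open", "book"], unigrams, bigrams)
```

In this sketch, a word order that occurred in the training data (`seen`) scores higher than one that did not (`unseen`). Splitting stems from affixes before counting is one generic way to reduce the number of distinct tokens when training data is small, which is the setting the abstract highlights.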
Date of Conference: 21-23 August 2010
Date Added to IEEE Xplore: 30 September 2010
Conference Location: Beijing, China