IEICE Trans - Morpheme-Based Modeling of Pronunciation Variation for Large Vocabulary Continuous Speech Recognition in Korean

Morpheme-Based Modeling of Pronunciation Variation for Large Vocabulary Continuous Speech Recognition in Korean

Kyong-Nim LEE
Minhwa CHUNG

Publication
IEICE TRANSACTIONS on Information and Systems Vol.E90-D No.7 pp.1063-1072
Publication Date: 2007/07/01
Online ISSN: 1745-1361
DOI: 10.1093/ietisy/e90-d.7.1063
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech and Hearing
Keyword:
pronunciation variation modeling, pronunciation lexicon, grapheme-to-phoneme conversion, phonological rule, Korean LVCSR,

Full Text: PDF(214.1KB)>>

Summary:
This paper describes a morpheme-based pronunciation model that is especially useful to develop the pronunciation lexicon for Large Vocabulary Continuous Speech Recognition (LVCSR) in Korean. To address pronunciation variation in Korean, we analyze phonological rules based on phonemic contexts together with morphological category and morpheme boundary information. Since the same phoneme sequences can be pronounced in different ways at across morpheme boundary, incorporating morphological environment is required to manipulate pronunciation variation modeling. We implement a rule-based pronunciation variants generator to produce a pronunciation lexicon with context-dependent multiple variants. At the lexical level, we apply an explicit modeling of pronunciation variation to add pronunciation variants at across morphemes as well as within morpheme into the pronunciation lexicon. At the acoustic level, we train the phone models with re-labeled transcriptions through forced alignment using context-dependent pronunciation lexicon. The proposed pronunciation lexicon offers the potential benefit for both training and decoding of a LVCSR system. Subsequently, we perform the speech recognition experiment on read speech task with 34K-morpheme vocabulary. Experiment confirms that improved performance is achieved by pronunciation variation modeling based on morpho-phonological analysis.

open access publishing via