
Morpheme-based grapheme to phoneme conversion using phonetic patterns and morphophonemic connectivity information

Authors:
Byeongchang Kim, Uiduk University, Kangdong, Kyongju, South Korea
Gary Geunbae Lee, Pohang University of Science and Technology and Uiduk University, Kangdong, Kyongju, South Korea
Jong-Hyeok Lee, Pohang University of Science and Technology and Uiduk University, Kangdong, Kyongju, South Korea

ACM Transactions on Asian Language Information Processing, Volume 1, Issue 1, pp. 65–82. https://doi.org/10.1145/595576.595580

Published: 01 March 2002


Abstract

Dictionary-based and rule-based methods for grapheme-to-phoneme conversion each have their own advantages and limitations. For example, the dictionary-based method requires a large phonetic dictionary and complex morphophonemic rules, while the LTS (letter-to-sound) rule-based method cannot by itself model the complete set of morphophonemic constraints.

This paper describes a grapheme-to-phoneme conversion method for Korean that combines the two approaches in a hybrid, using a phonetic pattern dictionary and CCV (consonant consonant vowel) LTS rules. The phonetic pattern dictionary, which represents the dictionary-based method, contains entries that pair a morpheme pattern with its phonetic pattern. The patterns represent candidate phonological changes at the left and right boundaries of morphemes. The CCV LTS rules, which represent the rule-based method, handle grapheme-to-phoneme conversion within morphemes.

The conversion method consists of two main steps, morpheme-to-phoneme conversion and a morphophonemic connectivity check, and two preprocessing steps, phrase break prediction and morpheme normalization. Phrase break prediction predicts phrase breaks with a stochastic method based on part-of-speech (POS) information. Morpheme normalization replaces non-Korean symbols with their corresponding standard Korean graphemes. In the morpheme-phoneticizing module, each morpheme in the phrase is converted into phonetic patterns by looking it up in the phonetic pattern dictionary. Graphemes within a morpheme are grouped into CCV units and converted into phonemes by the CCV LTS rules. The morphophonemic connectivity table supports checking the grammaticality of two adjacent phonetic morphemes.

In experiments with a corpus of 4,973 sentences free of non-Korean symbols, we achieved a grapheme-to-phoneme conversion rate of 99.98% and a sentence conversion rate of 99.0%. With a broadcast news corpus of 621 sentences, 99.7% of the graphemes and 86.6% of the sentences were correctly converted. A full Korean TTS (text-to-speech) system is now being implemented using this conversion method.
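
The interaction between the phonetic pattern dictionary and the morphophonemic connectivity table can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary entries, the boundary tags, the table contents, and the names `PHONETIC_PATTERNS`, `CONNECTIVITY`, and `convert` are all hypothetical, chosen only to show the idea of proposing candidate boundary changes per morpheme and then keeping the combination whose adjacent boundaries are licensed by the table. The within-morpheme CCV LTS rules and both preprocessing steps are omitted.

```python
from itertools import product

# Hypothetical toy phonetic pattern dictionary.
# Each candidate is (phonetic form, left-boundary tag, right-boundary tag).
# The tags are invented labels for this sketch, not the paper's notation.
PHONETIC_PATTERNS = {
    "먹": [("먹", "m", "k"), ("멍", "m", "k>ng")],   # stem "eat": plain or nasalized final
    "는": [("는", "n", "n")],                        # ending starting with a nasal
    "고": [("고", "g", "o"), ("꼬", "g>kk", "o")],   # ending: plain or tensed initial
}

# Hypothetical connectivity table: allowed pairs of
# (right-boundary tag of the left morpheme, left-boundary tag of the right morpheme).
CONNECTIVITY = {
    ("k>ng", "n"),   # nasalized final is licensed only before a nasal
    ("k", "g>kk"),   # plain final k is licensed before a tensed initial
}


def convert(morphemes):
    """Return the first candidate sequence whose adjacent boundaries all connect."""
    candidates = [PHONETIC_PATTERNS[m] for m in morphemes]  # KeyError if unknown
    for chain in product(*candidates):
        pairs = zip(chain, chain[1:])
        if all((left[2], right[1]) in CONNECTIVITY for left, right in pairs):
            return "".join(form for form, _, _ in chain)
    return None  # no grammatical combination found


print(convert(["먹", "는"]))  # 멍는
print(convert(["먹", "고"]))  # 먹꼬
```

Enumerating every combination with `itertools.product` is only practical for short phrases; a realistic system would more likely prune candidates pairwise as it moves through the phrase.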



Published in

ACM Transactions on Asian Language Information Processing Volume 1, Issue 1
March 2002
102 pages
ISSN:1530-0226
EISSN:1558-3430
DOI:10.1145/595576

Copyright © 2002 ACM
Publisher
Association for Computing Machinery
New York, NY, United States

Author Tags
CCV LTS rule
grapheme-to-phoneme conversion
morphophonemic modeling
phonetic pattern dictionary
text-to-speech system