Abstract
Multiple-pronunciation dictionaries are often used by automatic speech recognition systems in order to account for different speaking styles. In this paper, two methods based on statistical machine translation (SMT) are used to generate multiple pronunciations from the canonical pronunciation of a word. In the first method, a machine translation tool is used to perform phoneme-to-phoneme (p2p) conversion and derive variants from a given canonical pronunciation. The second method is based on a pivot method proposed for the paraphrase extraction task. The two methods are compared under different training conditions which allow single or multiple pronunciations in the training set, and their performance is evaluated in terms of recall and precision measures.
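As an illustration of the two ideas named in the abstract, the following minimal sketch scores candidate variants by marginalizing over a pivot, in the style used for paraphrase extraction, and then computes precision and recall against a reference set. It is not the authors' implementation: the function names, the toy probability tables, the phoneme strings, and the choice of pivot symbols are all hypothetical, and a real system would estimate the probabilities from trained translation models rather than hand-written dictionaries.

```python
from collections import defaultdict


def pivot_variants(canonical, to_pivot, from_pivot):
    """Score each variant v as sum over pivots g of p(g | canonical) * p(v | g).

    `to_pivot` maps a pronunciation to {pivot: probability};
    `from_pivot` maps a pivot to {pronunciation: probability}.
    Both tables are hypothetical stand-ins for trained models.
    """
    scores = defaultdict(float)
    for pivot, p_pivot in to_pivot.get(canonical, {}).items():
        for variant, p_variant in from_pivot.get(pivot, {}).items():
            if variant != canonical:  # the canonical form is not its own variant
                scores[variant] += p_pivot * p_variant
    return dict(scores)


def precision_recall(generated, reference):
    """Precision and recall of generated variants against reference variants."""
    generated, reference = set(generated), set(reference)
    correct = len(generated & reference)
    precision = correct / len(generated) if generated else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Toy example (made-up data): one canonical pronunciation, one pivot.
to_pivot = {"ae n d": {"and": 1.0}}
from_pivot = {"and": {"ae n d": 0.5, "ax n d": 0.25, "ax n": 0.25}}

scores = pivot_variants("ae n d", to_pivot, from_pivot)
reference = {"ax n d", "en"}  # hypothetical reference variants
precision, recall = precision_recall(scores, reference)
```

In this toy case two variants are generated and one of them appears in the reference set of two, so precision and recall are both 0.5. Pruning low-scoring variants before evaluation would trade recall for precision.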
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Karanasou, P., Lamel, L. (2010). Comparing SMT Methods for Automatic Generation of Pronunciation Variants. In: Loftsson, H., Rögnvaldsson, E., Helgadóttir, S. (eds) Advances in Natural Language Processing. NLP 2010. Lecture Notes in Computer Science, vol 6233. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14770-8_20
DOI: https://doi.org/10.1007/978-3-642-14770-8_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14769-2
Online ISBN: 978-3-642-14770-8
eBook Packages: Computer Science, Computer Science (R0)