Comparing SMT Methods for Automatic Generation of Pronunciation Variants

Karanasou, Panagiota; Lamel, Lori

doi:10.1007/978-3-642-14770-8_20

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6233)

Included in the following conference series:

International Conference on Natural Language Processing

Abstract

Multiple-pronunciation dictionaries are often used by automatic speech recognition systems in order to account for different speaking styles. In this paper, two methods based on statistical machine translation (SMT) are used to generate multiple pronunciations from the canonical pronunciation of a word. In the first method, a machine translation tool is used to perform phoneme-to-phoneme (p2p) conversion and derive variants from a given canonical pronunciation. The second method is based on a pivot method proposed for the paraphrase extraction task. The two methods are compared under different training conditions which allow single or multiple pronunciations in the training set, and their performance is evaluated in terms of recall and precision measures.
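
The abstract states that the generated variants are evaluated with recall and precision. As a rough illustration only, the sketch below computes micro-averaged precision and recall by exact match between generated and reference pronunciation sets. The function name `precision_recall`, the data layout, and the toy phoneme strings are assumptions made for this example; the paper's exact evaluation protocol is not given in the abstract and may differ.

```python
from typing import Dict, Set, Tuple

# A pronunciation is represented as a tuple of phoneme symbols (assumed layout).
Pron = Tuple[str, ...]


def precision_recall(
    generated: Dict[str, Set[Pron]],
    reference: Dict[str, Set[Pron]],
) -> Tuple[float, float]:
    """Micro-averaged precision and recall over all reference words.

    A generated variant counts as correct only if it exactly matches one of
    the reference variants of the same word. Words that appear only in
    `generated` are ignored, since there is nothing to score them against.
    """
    n_correct = n_generated = n_reference = 0
    for word, ref_variants in reference.items():
        hyp_variants = generated.get(word, set())
        n_correct += len(hyp_variants & ref_variants)
        n_generated += len(hyp_variants)
        n_reference += len(ref_variants)
    precision = n_correct / n_generated if n_generated else 0.0
    recall = n_correct / n_reference if n_reference else 0.0
    return precision, recall


# Toy data (hypothetical, not taken from the paper).
reference = {
    "tomato": {("t", "ah", "m", "ey", "t", "ow"), ("t", "ah", "m", "aa", "t", "ow")},
    "data": {("d", "ey", "t", "ah"), ("d", "ae", "t", "ah")},
}
generated = {
    "tomato": {("t", "ah", "m", "ey", "t", "ow"), ("t", "ow", "m", "ey", "t", "ow")},
    "data": {("d", "ey", "t", "ah")},
}

precision, recall = precision_recall(generated, reference)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case two of the three generated variants match a reference variant, and two of the four reference variants are recovered. Generating more variants per word typically raises recall at the cost of precision, which is why the two measures are reported together.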

Author information

Authors and Affiliations

Spoken Language Processing Group, LIMSI-CNRS, 91403, Orsay, France
Panagiota Karanasou & Lori Lamel

Editor information

Editors and Affiliations

School of Computer Science, Reykjavik University, Kringlan 1, 103, Reykjavik, Iceland
Hrafn Loftsson
Department of Icelandic, University of Iceland, Árnagardur v/Sudurgötu, 101, Reykjavik, Iceland
Eiríkur Rögnvaldsson
Arni Magnusson Institute for Icelandic Studies, Neshagi 16, 101, Reykjavik, Iceland
Sigrún Helgadóttir

About this paper

Cite this paper

Karanasou, P., Lamel, L. (2010). Comparing SMT Methods for Automatic Generation of Pronunciation Variants. In: Loftsson, H., Rögnvaldsson, E., Helgadóttir, S. (eds) Advances in Natural Language Processing. NLP 2010. Lecture Notes in Computer Science, vol 6233. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14770-8_20

DOI: https://doi.org/10.1007/978-3-642-14770-8_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14769-2
Online ISBN: 978-3-642-14770-8
eBook Packages: Computer Science, Computer Science (R0)
