Comparing SMT Methods for Automatic Generation of Pronunciation Variants

  • Conference paper
Advances in Natural Language Processing (NLP 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6233)

Abstract

Multiple-pronunciation dictionaries are often used by automatic speech recognition systems in order to account for different speaking styles. In this paper, two methods based on statistical machine translation (SMT) are used to generate multiple pronunciations from the canonical pronunciation of a word. In the first method, a machine translation tool is used to perform phoneme-to-phoneme (p2p) conversion and derive variants from a given canonical pronunciation. The second method is based on a pivot method proposed for the paraphrase extraction task. The two methods are compared under different training conditions which allow single or multiple pronunciations in the training set, and their performance is evaluated in terms of recall and precision measures.
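
As a rough illustration of the evaluation mentioned above, recall and precision over generated pronunciation variants can be sketched as follows. This is a minimal sketch under assumptions, not the authors' evaluation code: the abstract does not give the exact definitions used in the paper, so exact string match between a generated and a reference variant is assumed here, and the function name `precision_recall`, the example words, and the phoneme symbols are all hypothetical.

```python
from typing import Dict, Set, Tuple


def precision_recall(
    generated: Dict[str, Set[str]],
    reference: Dict[str, Set[str]],
) -> Tuple[float, float]:
    """Micro-averaged precision and recall of generated variants.

    Assumption: a generated variant counts as correct only if it exactly
    matches one of the reference variants of the same word.
    """
    correct = n_generated = n_reference = 0
    for word, ref_variants in reference.items():
        gen_variants = generated.get(word, set())
        correct += len(gen_variants & ref_variants)
        n_generated += len(gen_variants)
        n_reference += len(ref_variants)
    precision = correct / n_generated if n_generated else 0.0
    recall = correct / n_reference if n_reference else 0.0
    return precision, recall


# Hypothetical toy data: each word maps to a set of phoneme strings.
reference = {
    "data": {"d ey t ah", "d ae t ah"},
    "either": {"iy dh er", "ay dh er"},
}
generated = {
    "data": {"d ey t ah", "d aa t ah"},
    "either": {"iy dh er"},
}

precision, recall = precision_recall(generated, reference)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case two of the three generated variants appear in the reference (precision 2/3) and two of the four reference variants are recovered (recall 1/2). Counts are pooled over all words rather than averaged per word, which is one possible design choice among several.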


References

  1. Adda-Decker, M., Lamel, L.: Pronunciation variants across system configuration, language and speaking style. Speech Communication 29, 83–98 (1999)

  2. Bannard, C., Callison-Burch, C.: Paraphrasing with bilingual parallel corpora. In: Proc. of ACL, pp. 597–604 (2005)

  3. Divay, M., Vitale, A.-J.: Algorithms for grapheme-phoneme translation for English and French: Applications for database searches and speech synthesis. Computational Linguistics 23(4), 495–523 (1997)

  4. Fukada, T., Yoshimura, T., Sagisaka, Y.: Automatic generation of multiple pronunciations based on neural networks. Speech Communication 27(1), 63–73 (1999)

  5. Gerosa, M., Federico, M.: Coping with out-of-vocabulary words: open versus huge vocabulary ASR. In: ICASSP, pp. 4313–4316 (2009)

  6. van den Heuvel, H., Réveil, B., Martens, J.-P.: Pronunciation-based ASR for names. In: Proc. of Interspeech, pp. 2991–2994 (2009)

  7. Koehn, P., et al.: Moses: Open source toolkit for statistical machine translation. In: Proc. of ACL (2007)

  8. Lamel, L., Adda, G.: On designing pronunciation lexicons for large vocabulary, continuous speech recognition. In: Proc. ICSLP 1996, pp. 6–9 (1996)

  9. Laurent, A., Deleglise, P., Meignier, S.: Grapheme to phoneme conversion using a SMT system. In: Proc. of Interspeech (2009)

  10. Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: Bleu: a method for automatic evaluation of machine translation. In: Proc. of ACL, pp. 311–318 (2002)

  11. Quirk, C., Brockett, C., Dolan, W.: Monolingual Machine Translation for Paraphrase Generation. In: Proc. of EMNLP, pp. 142–149 (2004)

  12. Spiegel, M.F.: Using the ORATOR synthesizer for a public reverse-directory service: design, lessons, and recommendations. In: EUROSPEECH 1993, pp. 1897–1900 (1993)

  13. Stolcke, A.: SRILM - An extensible language modeling toolkit. In: Proc. ICSLP 2002, vol. 2, pp. 901–904 (2002)

  14. Tsai, M.-Y., Chou, F.-C., Lee, L.-S.: Pronunciation modeling with reduced confusion for Mandarin Chinese using a three-stage framework. IEEE Transactions on Audio, Speech and Language Processing 15(2), 661–675 (2007)

  15. Van Rijsbergen, C.J.: Information Retrieval. Butterworths, London (1979)

  16. Weintraub, M., Fosler, E., Galles, C., Kao, Y.-H., Khudanpur, S., Saraclar, M., Wegmann, S.: WS96 project report: Automatic learning of word pronunciation from data. In: JHU Workshop Pronunciation Group (1996)

  17. Wester, M.: Pronunciation modeling for ASR - Knowledge-based and data-driven methods. Comput. Speech Lang., 69–85 (2003)

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Karanasou, P., Lamel, L. (2010). Comparing SMT Methods for Automatic Generation of Pronunciation Variants. In: Loftsson, H., Rögnvaldsson, E., Helgadóttir, S. (eds) Advances in Natural Language Processing. NLP 2010. Lecture Notes in Computer Science, vol 6233. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14770-8_20

  • DOI: https://doi.org/10.1007/978-3-642-14770-8_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14769-2

  • Online ISBN: 978-3-642-14770-8

  • eBook Packages: Computer Science, Computer Science (R0)
