Building and Training of a New Mexican Spanish Voice for Festival

  • Conference paper
MICAI 2005: Advances in Artificial Intelligence (MICAI 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3789)

Abstract

In this paper we describe the work done to build a new diphone-concatenation voice for the Spanish spoken in Mexico. The voice is compatible with the Festival text-to-speech synthesis system. In developing each module of the system, the specific features of Spanish were taken into account. We aim to enhance the naturalness of the synthesized voice by including a prosodic model. The prosodic factors the model takes into consideration are phrasing, accentuation, duration, and F0 contour. Duration and F0 prediction models were trained on natural speech corpora. We found the best prediction models by testing several machine learning methods with two different corpora. The paper describes the building and training process, as well as the results and their interpretation.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Espinosa, H.P., García, C.A.R. (2005). Building and Training of a New Mexican Spanish Voice for Festival. In: Gelbukh, A., de Albornoz, Á., Terashima-Marín, H. (eds) MICAI 2005: Advances in Artificial Intelligence. MICAI 2005. Lecture Notes in Computer Science, vol 3789. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11579427_89

  • DOI: https://doi.org/10.1007/11579427_89

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29896-0

  • Online ISBN: 978-3-540-31653-4

  • eBook Packages: Computer Science, Computer Science (R0)
