Building and Training of a New Mexican Spanish Voice for Festival

Espinosa, Humberto Pérez; García, Carlos Alberto Reyes

doi:10.1007/11579427_89

Humberto Pérez Espinosa &
Carlos Alberto Reyes García

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3789)

Included in the following conference series:

Mexican International Conference on Artificial Intelligence

Abstract

In this paper we describe the work done to build a new diphone-concatenation voice for the Spanish spoken in Mexico. The voice is compatible with the Festival text-to-speech synthesis system. In developing each module of the system, the particular features of Spanish were taken into account. We aim to enhance the naturalness of the synthesized voice by including a prosodic model. The prosodic factors the model takes into consideration are phrasing, accentuation, duration, and F0 contour. The duration and F0 prediction models were trained on natural speech corpora. We found the best prediction models by testing several machine learning methods on two different corpora. The paper describes the building and training process, as well as the results and their interpretation.
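
As a rough illustration of what diphone concatenation implies for unit selection, the sketch below maps a phone sequence to the diphone units needed to synthesize it. It is not taken from the paper: the function name `phones_to_diphones`, the `pau` silence symbol, the `left-right` unit naming, and the example phone string are all assumptions made for illustration, and the actual phone set of the voice is not given in the abstract.

```python
from typing import List


def phones_to_diphones(phones: List[str], silence: str = "pau") -> List[str]:
    """Return the diphone unit names covering a phone sequence.

    The sequence is padded with a silence phone at both ends, so an
    utterance of n phones yields n + 1 diphones. The "left-right"
    naming and the "pau" symbol are illustrative assumptions.
    """
    padded = [silence] + list(phones) + [silence]
    # Each diphone spans the transition between two adjacent phones.
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


# Hypothetical phone string for the word "casa".
print(phones_to_diphones(["k", "a", "s", "a"]))
# prints ['pau-k', 'k-a', 'a-s', 's-a', 'a-pau']
```

A prosodic model such as the one the abstract describes would then assign a duration and an F0 target to each segment before the recorded units are joined.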

Author information

Authors and Affiliations

Instituto Nacional de Astrofísica Óptica y Electrónica, Luis Enrique Erro No. 1, Tonantzintla, Puebla, México
Humberto Pérez Espinosa & Carlos Alberto Reyes García

Editor information

Editors and Affiliations

National Polytechnic Institute, Center for Computing Research, 07738, Mexico City, México
Alexander Gelbukh
Tecnológico de Monterrey (ITESM), Campus Ciudad de México (CCM), Calle del Puente 222, Col. Ejidos de Huipulco, 14360 DF, Tlalpan, Mexico
Álvaro de Albornoz
Center for Intelligent Systems, Tecnológico de Monterrey, Campus Monterrey, 64849, Monterrey, N.L., Mexico
Hugo Terashima-Marín

About this paper

Cite this paper

Espinosa, H.P., García, C.A.R. (2005). Building and Training of a New Mexican Spanish Voice for Festival. In: Gelbukh, A., de Albornoz, Á., Terashima-Marín, H. (eds) MICAI 2005: Advances in Artificial Intelligence. MICAI 2005. Lecture Notes in Computer Science, vol 3789. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11579427_89

DOI: https://doi.org/10.1007/11579427_89
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29896-0
Online ISBN: 978-3-540-31653-4
eBook Packages: Computer Science, Computer Science (R0)
