Abstract
Modifying prosodic parameters in concatenative synthesis systems can degrade speech quality, especially when large pitch changes are applied. To avoid large changes to the speech signal parameters, the speech corpus should contain segments whose phonetic and prosodic features are close to the predicted ones. This condition is more often met by a speech corpus specially designed to be both phonetically and prosodically rich. The design of such a corpus depends strongly on the script chosen for recording. To this end, a procedure for selecting the recording script of a TTS system is proposed for Brazilian Portuguese. The selected sentences include declarative, exclamatory, and interrogative ones. Phonetic and prosodic information is first represented as a set of feature vectors. Next, the number of distinct feature vectors is used as the fitness value for a genetic-based sentence selection. Experimental results indicate a considerable improvement in script variability for speech synthesis applications.
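The selection idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the feature set, the chromosome encoding, or the genetic operators used in the chapter, so everything below is an assumption made for illustration: each sentence is reduced to a set of hashable feature vectors, a candidate script is a list of distinct sentence indices, and the names `fitness` and `select_script`, the truncation selection, the union-based crossover, and the single-swap mutation are all hypothetical choices.

```python
import random


def fitness(selection, sentence_features):
    """Count the distinct feature vectors covered by the selected sentences."""
    covered = set()
    for idx in selection:
        covered.update(sentence_features[idx])
    return len(covered)


def select_script(sentence_features, script_size, pop_size=30,
                  generations=100, mutation_rate=0.1, seed=0):
    """Hypothetical GA: evolve lists of distinct sentence indices.

    Assumes pop_size >= 4 and script_size <= len(sentence_features).
    """
    rng = random.Random(seed)
    all_ids = range(len(sentence_features))

    def score(individual):
        return fitness(individual, sentence_features)

    # Initial population: random scripts without repeated sentences.
    population = [rng.sample(all_ids, script_size) for _ in range(pop_size)]

    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        population.sort(key=score, reverse=True)
        parents = population[:pop_size // 2]

        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: draw the child from the union of both parents.
            pool = sorted(set(a) | set(b))
            child = rng.sample(pool, script_size)
            # Mutation: swap one sentence for one not yet in the script.
            if rng.random() < mutation_rate:
                outside = [i for i in all_ids if i not in child]
                if outside:
                    child[rng.randrange(script_size)] = rng.choice(outside)
            children.append(child)

        population = parents + children

    return max(population, key=score)


# Toy corpus: each sentence is a set of made-up feature-vector labels.
toy_features = [
    {"a", "b"}, {"a", "b"}, {"c", "d"}, {"a"}, {"e", "f"}, {"b", "c"},
]
best = select_script(toy_features, script_size=3)
print(sorted(best), fitness(best, toy_features))
```

In a real setting, each label in `toy_features` would be replaced by a tuple encoding the phonetic and prosodic context of a unit; the fitness then rewards scripts that cover many different contexts rather than repeating common ones.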
This work was partially supported by the Brazilian National Council for Scientific and Technological Development (CNPq), Studies and Projects Funding Body (FINEP), and Dígitro Tecnologia Ltda.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nicodem, M.V., Seara, I.C., dos Anjos, D., Seara, R., Seara, R. (2008). Evolutionary-Based Design of a Brazilian Portuguese Recording Script for a Concatenative Synthesis System. In: Teixeira, A., de Lima, V.L.S., de Oliveira, L.C., Quaresma, P. (eds) Computational Processing of the Portuguese Language. PROPOR 2008. Lecture Notes in Computer Science, vol 5190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85980-2_9
DOI: https://doi.org/10.1007/978-3-540-85980-2_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85979-6
Online ISBN: 978-3-540-85980-2
eBook Packages: Computer Science (R0)