Abstract
In this paper, we consider the problem of accent modification between Castilian Spanish and Mexican Spanish. This is an interesting application area for tasks such as the automatic dubbing of films and videos into different accents. We first apply statistical parametric speech synthesis based on hidden Markov models (HMMs) to produce two artificial voices, one for each accent. This type of speech synthesis learns and reproduces essential parameters of the voice in question. We then propose a way to adapt these parameters between the two accents, so that the prosodic differences between the voices are modeled and transformed directly. To build the initial voices, we use a speech database recorded by professional actors from Spain and Mexico. The results of subjective and objective tests are promising, and the method is, in principle, applicable to modification between other Spanish accents.
Acknowledgements
This work was supported by SEP and CONACyT under the Program SEP-CONACyT, CB-2012-01, No. 182432, in Mexico, and by the University of Costa Rica. We also thank ELRA for supplying the original Emotional speech synthesis database.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Coto-Jiménez, M., Goddard-Close, J. (2016). Hidden Markov Models for Artificial Voice Production and Accent Modification. In: Montes y Gómez, M., Escalante, H., Segura, A., Murillo, J. (eds) Advances in Artificial Intelligence - IBERAMIA 2016. IBERAMIA 2016. Lecture Notes in Computer Science, vol 10022. Springer, Cham. https://doi.org/10.1007/978-3-319-47955-2_34
DOI: https://doi.org/10.1007/978-3-319-47955-2_34
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47954-5
Online ISBN: 978-3-319-47955-2
eBook Packages: Computer Science, Computer Science (R0)