Abstract
This paper presents a new voice conversion algorithm that transforms an utterance of a source speaker into an utterance of a target speaker. The approach is based on pitch-synchronous speech analysis, the discrete cosine transform (DCT), nonlinear spectral warping with spectrum interpolation, and pitch-synchronous speech synthesis with overlapping using the speech production model. The DCT speech model also contains information about the phase properties of the modeled speech frame but, in contrast to a model based on, e.g., the discrete Fourier transform, it is a real-valued model and can be used efficiently for speech coding and voice conversion. The finite impulse response of the converted DCT speech model is obtained by the inverse DCT and is of the mixed-phase type. The proposed voice conversion procedure results in speech with high naturalness.
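The per-frame core of such a scheme (DCT, spectral warping with interpolation, inverse DCT) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the warping function, so the power-law map controlled by the hypothetical parameter `gamma` is a stand-in, and the function names `dct2`, `idct2`, `warp_spectrum`, and `convert_frame` are invented for this example. Pitch-synchronous segmentation and overlapped synthesis are omitted, and the input frame is assumed to hold at least two samples.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a real sequence; the result is also real."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        for k in range(1, n):
            s += c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out

def warp_spectrum(c, gamma):
    """Resample DCT coefficients along a nonlinearly warped frequency axis.

    Assumption: output bin k reads from source position
    (n - 1) * (k / (n - 1)) ** gamma, with linear interpolation between
    neighbouring coefficients. gamma == 1.0 is the identity mapping.
    """
    n = len(c)
    out = []
    for k in range(n):
        pos = min((n - 1) * (k / (n - 1)) ** gamma, n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append((1.0 - frac) * c[lo] + frac * c[hi])
    return out

def convert_frame(frame, gamma):
    """Analyse one frame with the DCT, warp it, and return the real
    impulse response given by the inverse DCT of the warped spectrum."""
    return idct2(warp_spectrum(dct2(frame), gamma))

# Toy frame: a decaying sinusoid standing in for one pitch period.
frame = [math.sin(2 * math.pi * 3 * i / 32) * math.exp(-i / 10) for i in range(32)]
converted = convert_frame(frame, 1.3)
```

Because the DCT of a real frame is itself real, the warping step interpolates a single real sequence rather than separate magnitude and phase tracks, which is the practical advantage over a Fourier-based model that the abstract points to.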
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vích, R., Vondra, M. (2012). Pitch Synchronous Transform Warping in Voice Conversion. In: Esposito, A., Esposito, A.M., Vinciarelli, A., Hoffmann, R., Müller, V.C. (eds) Cognitive Behavioural Systems. Lecture Notes in Computer Science, vol 7403. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34584-5_24
DOI: https://doi.org/10.1007/978-3-642-34584-5_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34583-8
Online ISBN: 978-3-642-34584-5
eBook Packages: Computer Science, Computer Science (R0)