Pitch Synchronous Transform Warping in Voice Conversion

Vích, Robert; Vondra, Martin

doi:10.1007/978-3-642-34584-5_24

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7403)

Abstract

This paper presents a new voice conversion algorithm that transforms an utterance of a source speaker into an utterance of a target speaker. The approach is based on pitch synchronous speech analysis, the Discrete Cosine Transform (DCT), nonlinear spectral warping with spectrum interpolation, and pitch synchronous speech synthesis with overlapping using the speech production model. The DCT speech model also contains information about the phase properties of the modeled speech frame but, in contrast to a model based on, e.g., the discrete Fourier transform, it is a real-valued model and can be used efficiently for speech coding and voice conversion. The finite impulse response of the converted DCT speech model is obtained by the inverse DCT and is of the mixed phase type. The proposed voice conversion procedure results in speech with high naturalness.
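
The per-frame core of such a scheme (DCT analysis, spectral warping with interpolation, inverse DCT) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the warping function, so the two-segment piecewise-linear map below, the linear interpolation of DCT coefficients, and all function and parameter names (`warp_position`, `knee_out`, `knee_in`, `convert_frame`) are assumptions chosen for illustration. Pitch-mark detection and the overlapped pitch synchronous synthesis are omitted.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a real sequence (direct O(n^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III); returns a real sequence."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def warp_position(f, knee_out, knee_in):
    """Hypothetical warping function on normalized frequency [0, 1].

    Two linear segments through (0, 0), (knee_out, knee_in) and (1, 1).
    Both knees must lie strictly between 0 and 1.
    """
    if f <= knee_out:
        return f * knee_in / knee_out
    return knee_in + (f - knee_out) * (1.0 - knee_in) / (1.0 - knee_out)


def warp_dct(c, knee_out, knee_in):
    """Resample DCT coefficients along the warped frequency axis.

    Output bin k reads the source spectrum at the warped position,
    using linear interpolation between neighbouring coefficients.
    Requires at least two coefficients.
    """
    n = len(c)
    warped = []
    for k in range(n):
        pos = warp_position(k / (n - 1), knee_out, knee_in) * (n - 1)
        lo = min(int(math.floor(pos)), n - 1)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        warped.append((1.0 - frac) * c[lo] + frac * c[hi])
    return warped


def convert_frame(frame, knee_out=0.5, knee_in=0.4):
    """Analyse one frame, warp its DCT spectrum, and resynthesise it."""
    return idct2(warp_dct(dct2(frame), knee_out, knee_in))
```

With `knee_out == knee_in` the map is the identity and the frame is reproduced up to rounding error; other settings stretch one part of the frequency axis and compress the other. Because the DCT of a real frame is itself real, the whole chain stays in real arithmetic, which is the property the abstract contrasts with a DFT-based model.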

Author information

Authors and Affiliations

Institute of Photonics and Electronics, Academy of Sciences of the Czech Republic, Chaberska 57, CZ 18251, Prague 8, Czech Republic
Robert Vích & Martin Vondra

Editor information

Editors and Affiliations

Department of Psychology, and IIASS, Seconda Università degli Studi di Napoli, Italy
Anna Esposito
Istituto Nazionale di Geofisica e Vulcanologia, sezione di Napoli Osservatorio Vesuviano, Napoli, Italy
Antonietta M. Esposito
School of Computing Science, University of Glasgow, Glasgow, UK
Alessandro Vinciarelli
Laboratory of Acoustics and Speech Communication, Technische Universität Dresden, 01062, Dresden, Germany
Rüdiger Hoffmann
Dept. of Humanities and Social Sciences, Anatolia College/ACT, P.O. Box 21021, 55510, Pylaia, Greece
Vincent C. Müller

Cite this paper

Vích, R., Vondra, M. (2012). Pitch Synchronous Transform Warping in Voice Conversion. In: Esposito, A., Esposito, A.M., Vinciarelli, A., Hoffmann, R., Müller, V.C. (eds) Cognitive Behavioural Systems. Lecture Notes in Computer Science, vol 7403. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34584-5_24

DOI: https://doi.org/10.1007/978-3-642-34584-5_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34583-8
Online ISBN: 978-3-642-34584-5
eBook Packages: Computer Science, Computer Science (R0)