Abstract
Conventional cepstral speech modeling is based on a minimum-phase parametric speech production model with an infinite impulse response. That approach approximates only the logarithmic magnitude frequency response of the corresponding speech frame. This contribution describes the principle of cepstral speech modeling using the complex cepstrum. The resulting mixed-phase vocal tract model with a finite impulse response also carries information about the phase properties of the modeled speech frame. This model approximates the speech signal more accurately than the model based on the real cepstrum, but its numerical complexity and memory requirements are at least twice as high.
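The difference between the two cepstra can be illustrated with a minimal NumPy sketch. It is not the chapter's implementation: the function names, the FFT length `n_fft`, and the way the linear phase term is removed are assumptions chosen for illustration. The real cepstrum uses only the log magnitude spectrum, while the complex cepstrum also uses the unwrapped phase and therefore keeps the anticausal (maximum-phase) part of the frame.

```python
import numpy as np

def real_cepstrum(frame, n_fft=1024):
    """Real cepstrum: inverse FFT of the log magnitude spectrum only."""
    spectrum = np.fft.fft(frame, n_fft)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small offset avoids log(0)
    return np.fft.ifft(log_mag).real

def complex_cepstrum(frame, n_fft=1024):
    """Complex cepstrum: inverse FFT of log magnitude plus j * unwrapped phase.

    Assumes an even n_fft and a frame with positive DC value. Returns the
    cepstrum and the integer delay (in samples) removed as a linear phase term.
    """
    spectrum = np.fft.fft(frame, n_fft)
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    phase = np.unwrap(np.angle(spectrum))
    half = n_fft // 2
    # The unwrapped phase at the Nyquist bin is -pi times the pure delay.
    delay = -int(np.round(phase[half] / np.pi))
    # Remove the linear phase so the remaining phase is periodic and odd.
    phase = phase + np.pi * delay * np.arange(n_fft) / half
    cepstrum = np.fft.ifft(log_mag + 1j * phase).real
    return cepstrum, delay

# Hypothetical mixed-phase test frame: (1 + 0.5 z^-1) * (0.4 + z^-1).
frame = np.array([0.4, 1.2, 0.5])
c_complex, delay = complex_cepstrum(frame)
c_real = real_cepstrum(frame)
print("delay:", delay)
print("causal part:", c_complex[1:4])
print("anticausal part:", c_complex[-3:][::-1])
print("real cepstrum:", c_real[1:4])
```

In this sketch the positive indices of `c_complex` describe the minimum-phase factor and the negative indices (stored at the end of the array) describe the maximum-phase factor. The real cepstrum equals the even part of the complex cepstrum, so it cannot separate the two. The extra phase unwrapping and the two-sided cepstrum give a rough idea of where additional computation and storage come from.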
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Vondra, M., Vích, R. (2011). Speech Modeling Using the Complex Cepstrum. In: Esposito, A., Esposito, A.M., Martone, R., Müller, V.C., Scarpetta, G. (eds) Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces. Theoretical and Practical Issues. Lecture Notes in Computer Science, vol 6456. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18184-9_27
DOI: https://doi.org/10.1007/978-3-642-18184-9_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-18183-2
Online ISBN: 978-3-642-18184-9
eBook Packages: Computer Science, Computer Science (R0)