Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6456)

Abstract

Conventional cepstral speech modeling is based on the minimum-phase parametric speech production model with an infinite impulse response. That approach approximates only the logarithmic magnitude frequency response of the corresponding speech frame. This contribution describes the principle of cepstral speech modeling using the complex cepstrum. The resulting mixed-phase vocal tract model with a finite impulse response also contains information about the phase properties of the modeled speech frame. This model approximates the speech signal with higher accuracy than the model based on the real cepstrum, but its numerical complexity and memory requirements are at least twice as high.
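
The difference between the two cepstra can be illustrated with a minimal NumPy sketch. It is not the chapter's implementation: the function names, the FFT length `n_fft`, the `1e-12` log floor, and the two-sample test frames are assumptions chosen for illustration, and the linear-phase removal follows the common textbook FFT-based procedure. It also assumes the frame has a positive DC value and no spectral zeros on the unit circle. The real cepstrum uses only the log magnitude, so a frame and its time-reversed copy are indistinguishable; the complex cepstrum keeps the unwrapped phase and separates them.

```python
import numpy as np

def real_cepstrum(frame, n_fft=1024):
    """Real cepstrum: inverse FFT of the log-magnitude spectrum only."""
    spectrum = np.fft.fft(frame, n_fft)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small floor avoids log(0)
    return np.fft.ifft(log_mag).real

def complex_cepstrum(frame, n_fft=1024):
    """Complex cepstrum: inverse FFT of log magnitude plus j * unwrapped phase.

    Returns the cepstrum and the integer linear-phase term that was removed
    (negative for a delayed frame).
    """
    spectrum = np.fft.fft(frame, n_fft)
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    phase = np.unwrap(np.angle(spectrum))
    half = n_fft // 2
    # The unwrapped phase at the Nyquist bin is an integer multiple of pi;
    # that integer is the linear-phase (pure delay) component.
    delay = int(np.round(phase[half] / np.pi))
    phase = phase - np.pi * delay * np.arange(n_fft) / half
    cepstrum = np.fft.ifft(log_mag + 1j * phase).real
    return cepstrum, delay

# Hypothetical two-sample frames with identical magnitude spectra.
min_phase_frame = np.array([1.0, 0.5])   # zero inside the unit circle
reversed_frame = np.array([0.5, 1.0])    # zero outside the unit circle

c_min, d_min = complex_cepstrum(min_phase_frame)
c_rev, d_rev = complex_cepstrum(reversed_frame)

# Identical real cepstra: the phase information is lost.
print(np.allclose(real_cepstrum(min_phase_frame),
                  real_cepstrum(reversed_frame)))
# Minimum-phase frame: energy at positive quefrencies only.
print(c_min[1], c_min[-1])
# Reversed frame: energy at negative quefrencies (stored at the array end).
print(c_rev[1], c_rev[-1])
```

In this sketch the complex cepstrum of the minimum-phase frame is causal, while the reversed frame produces an anticausal cepstrum. A mixed-phase model of the kind the abstract describes keeps both halves, which is consistent with the stated doubling of storage relative to a model built on the real cepstrum.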

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Vondra, M., Vích, R. (2011). Speech Modeling Using the Complex Cepstrum. In: Esposito, A., Esposito, A.M., Martone, R., Müller, V.C., Scarpetta, G. (eds) Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces. Theoretical and Practical Issues. Lecture Notes in Computer Science, vol 6456. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18184-9_27

  • DOI: https://doi.org/10.1007/978-3-642-18184-9_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-18183-2

  • Online ISBN: 978-3-642-18184-9

  • eBook Packages: Computer Science, Computer Science (R0)