Synthesis of Codebooks with Perceptually Monitored Structure for Multiband CELP-Coders

  • Conference paper
Biometrics, Computer Security Systems and Artificial Intelligence Applications

Abstract

This work examines how the subband decomposition scheme and the composition of the training set affect the quality of reconstructed speech when synthesising codebooks for multiband CELP coders. Codebook quality and its dependence on codebook structure are studied. The paper presents a multiband codebook with a multistage organization and a reconfigurable structure, optimized according to the SNR, Bark Spectrum Distortion (BSD), Modified Bark Spectrum Distortion (MBSD) and Noise-to-Mask Ratio (NMR) criteria, for control by a psychoacoustic model based on the Warped Discrete Fourier Transform (WDFT).
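To make the terms in the abstract concrete, the following is a minimal Python sketch of two of the ideas it names: the SNR criterion and a multistage codebook, in which each stage quantizes the residual left by the previous stage. It is an illustration under stated assumptions, not the authors' method: the function names `snr_db` and `multistage_quantize`, the greedy stage-by-stage search, and the toy codebooks are all hypothetical, and the perceptual criteria (BSD, MBSD, NMR) and the WDFT-based psychoacoustic control are not modelled.

```python
import numpy as np


def snr_db(reference, reconstructed):
    """Signal-to-noise ratio in dB between a reference and its reconstruction."""
    reference = np.asarray(reference, dtype=float)
    noise = reference - np.asarray(reconstructed, dtype=float)
    noise_energy = np.sum(noise ** 2)
    if noise_energy == 0.0:
        return float("inf")  # perfect reconstruction
    return 10.0 * np.log10(np.sum(reference ** 2) / noise_energy)


def multistage_quantize(vector, stages):
    """Greedy multistage vector quantization (illustrative assumption).

    `stages` is a list of 2-D arrays, each of shape (codebook_size, dimension).
    Each stage picks the codevector closest (squared error) to the current
    residual; the reconstruction is the sum of the chosen codevectors.
    """
    residual = np.asarray(vector, dtype=float).copy()
    reconstruction = np.zeros_like(residual)
    indices = []
    for codebook in stages:
        codebook = np.asarray(codebook, dtype=float)
        distances = np.sum((codebook - residual) ** 2, axis=1)
        best = int(np.argmin(distances))
        indices.append(best)
        reconstruction += codebook[best]
        residual -= codebook[best]
    return indices, reconstruction


# Toy two-stage codebook (hypothetical values, not taken from the paper).
toy_stages = [
    np.array([[1.0, 1.0], [0.0, 0.0]]),
    np.array([[0.0, 1.0], [0.0, 0.5]]),
]
toy_indices, toy_reconstruction = multistage_quantize([1.0, 2.0], toy_stages)
```

In this sketch the number of stages actually used could be varied per frame, which is one simple way to picture a reconfigurable structure; how the paper selects the structure is not stated in the abstract.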



Copyright information

© 2006 Springer Science+Business Media, LLC

About this paper

Cite this paper

Livshitz, M., Petrovsky, A. (2006). Synthesis of Codebooks with Perceptually Monitored Structure for Multiband CELP-Coders. In: Saeed, K., Pejaś, J., Mosdorf, R. (eds) Biometrics, Computer Security Systems and Artificial Intelligence Applications. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-36503-9_6

  • DOI: https://doi.org/10.1007/978-0-387-36503-9_6

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-36232-8

  • Online ISBN: 978-0-387-36503-9

  • eBook Packages: Computer Science, Computer Science (R0)
