Synthesis of Codebooks with Perceptually Monitored Structure for Multiband CELP-Coders

Livshitz, Michael; Petrovsky, Alexander

doi:10.1007/978-0-387-36503-9_6

Abstract

This work examines how the subband decomposition scheme and the composition of the training set affect the quality of reconstructed speech when synthesising codebooks for multiband CELP-coders. Codebook quality and its dependence on codebook structure are studied. The work presents a multiband codebook with a multistage organization and a reconfigurable structure, optimized according to the SNR, Bark Spectrum Distortion (BSD), Modified Bark Spectrum Distortion (MBSD), and Noise-to-Mask Ratio (NMR) criteria, for control by a psychoacoustic model based on the Warped Discrete Fourier Transform (WDFT).

Author information

Authors and Affiliations

Computer Engineering Department, Belarusian State University of Informatics and Radioelectronics, P.Brovky, 6, Minsk, Belarus, 220027
Michael Livshitz
University of Finance and Management in Bialystok branch in Elk, Grunwaldzka 1, Elk, Poland
Alexander Petrovsky

Editor information

Editors and Affiliations

Faculty of Computer Science, Bialystok Technical University, Wiejska 45A, 15-351, Bialystok, Poland
Khalid Saeed
Faculty of Computer Science, Szczecin University of Technology, Zolnierska 49, 71 210, Szczecin, Poland
Jerzy Pejaś
University of Finance and Management in Bialystok, Ciepla 40, 15 472, Bialystok, Poland
Romuald Mosdorf

Cite this paper

Livshitz, M., Petrovsky, A. (2006). Synthesis of Codebooks with Perceptually Monitored Structure for Multiband CELP-Coders. In: Saeed, K., Pejaś, J., Mosdorf, R. (eds) Biometrics, Computer Security Systems and Artificial Intelligence Applications. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-36503-9_6

DOI: https://doi.org/10.1007/978-0-387-36503-9_6
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-36232-8
Online ISBN: 978-0-387-36503-9
eBook Packages: Computer Science (R0)
