Embedded bandwidth scalable wideband codec using hybrid matching pursuit harmonic/CELP scheme

Journal of Intelligent Manufacturing

Abstract

Recent advances in speech coding have made wideband coding feasible at bit rates suitable for mobile communication. Here we propose a novel hybrid harmonic/Code Excited Linear Prediction (CELP) scheme for high-band coding in a band-split scalable wideband codec, where the low band (0–4 kHz) is critically subsampled and coded with a selectable existing narrowband codec: G.723.1 at 5.3 or 6.3 kbps, G.729 at 8 kbps, or G.729E at 11.8 kbps. The high-band signal is classified into a stationary mode (SM) and a non-stationary mode (NSM) based on its characteristics. In SM, the high-band signal is compressed using multi-stage coding that combines the sinusoidal model and CELP: the first stage applies a matching pursuit (MP) algorithm with a damping factor, without either overlap-add (OLA) or smoothly interpolative synthesis, and the second stage uses CELP with a circular codebook. In NSM, the high-band signal is coded by CELP with both pulse and circular codebooks, using a complexity-reduced algorithm. To provide scalability in high-band coding, two enhancement layers increase the number of pulses and control the number of quantized sinusoidal parameters. This paper describes the new algorithm and discusses novel techniques for efficient bandwidth-scalable wideband speech coding, together with the resulting subjective quality. For efficient bit allocation and enhanced performance, the pitch of the high-band codec is estimated from the quantized pitch parameter of the low-band codec. In an informal listening test, the subjective speech quality was rated comparable to that obtained with G.722.2 as a full-band wideband codec and with G.729.1, the recently standardized band-split wideband codec, as the high-band codec.
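The first SM stage is described only at a high level here, so the following is a minimal sketch of generic matching pursuit over a dictionary of damped sinusoids, not the authors' implementation. The frequency grid, the damping values, and all function names (`damped_atom`, `build_dictionary`, `matching_pursuit`, `energy`) are illustrative assumptions. At each iteration the atom most correlated with the residual is selected and its projection is subtracted, so the residual energy never increases.

```python
import math

def damped_atom(freq, damping, phase, length):
    """Unit-norm damped sinusoid: damping**n * cos(freq*n + phase)."""
    raw = [damping ** n * math.cos(freq * n + phase) for n in range(length)]
    norm = math.sqrt(sum(v * v for v in raw))
    return [v / norm for v in raw]

def build_dictionary(length, num_freqs, dampings):
    """Hypothetical dictionary: cosine and sine atoms on a uniform frequency grid."""
    atoms = []
    for k in range(1, num_freqs + 1):
        freq = math.pi * k / (num_freqs + 1)    # strictly inside (0, pi)
        for d in dampings:
            for phase in (0.0, -math.pi / 2):   # cosine atom, then sine atom
                params = (freq, d, phase)
                atoms.append((params, damped_atom(freq, d, phase, length)))
    return atoms

def matching_pursuit(signal, atoms, num_iters):
    """Greedy decomposition: returns [(params, coefficient), ...] and the residual."""
    residual = list(signal)
    picks = []
    for _ in range(num_iters):
        best_params, best_atom, best_coef = None, None, 0.0
        for params, atom in atoms:
            coef = sum(r * a for r, a in zip(residual, atom))  # inner product
            if abs(coef) > abs(best_coef):
                best_params, best_atom, best_coef = params, atom, coef
        if best_atom is None:   # residual is orthogonal to every atom
            break
        # Subtract the projection onto the selected unit-norm atom.
        residual = [r - best_coef * a for r, a in zip(residual, best_atom)]
        picks.append((best_params, best_coef))
    return picks, residual

def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)
```

In a codec along the lines described, the selected parameters and coefficients would be quantized and the residual passed on to the second (CELP) stage; both steps are outside the scope of this sketch.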

Author information

Corresponding author

Correspondence to BoNam Kim.

About this article

Cite this article

Jeong, G., Sohn, S., Lim, J. et al. Embedded bandwidth scalable wideband codec using hybrid matching pursuit harmonic/CELP scheme. J Intell Manuf 23, 1315–1325 (2012). https://doi.org/10.1007/s10845-010-0414-3

  • DOI: https://doi.org/10.1007/s10845-010-0414-3
