Abstract
Low-rate speech coding technology has recently made significant progress with the introduction of new interpolative algorithms (Shoham, 1993; Kleijn and Haagen, 1995). The inherent complexity of these algorithms is, however, too high to be commercially useful for low-cost applications. In this paper we propose new approaches to low-complexity speech coding at coding rates of 1.2 and 2.4 kbps. The proposed methods utilize all the advantages of interpolative coding but greatly simplify the analysis and synthesis operations, to the point where low-cost two-way digital speech communication can be easily implemented on DSP or host platforms. At 2.4 kbps, the complexity of the proposed coder is about 7.5 and 2.5 MFLOPS for the encoder and decoder, respectively. At 1.2 kbps, the complexity is about 6 and 2.3 MFLOPS for the encoder and decoder, respectively. The small computational load of these coders makes them suitable for multi-tasking environments and low-cost terminals. Informal subjective evaluation shows that good communication quality is obtained at 2.4 kbps. Communication quality is below toll quality, but the perceived coding effects are not annoying and do not prevent long, sustained two-way conversation with a high degree of intelligibility. The quality does not degrade significantly at 1.2 kbps and is considered sufficient for messaging applications.
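As a rough illustration of the complexity figures quoted in the abstract, the Python sketch below adds the encoder and decoder loads to estimate what a single two-way terminal would need at each rate. It also computes a per-frame bit budget. The function names and the `frame_ms` parameter are hypothetical and introduced only for this illustration; the abstract does not state a frame length.

```python
# Complexity figures quoted in the abstract (MFLOPS), keyed by bit rate in kbps.
COMPLEXITY_MFLOPS = {
    2.4: {"encoder": 7.5, "decoder": 2.5},
    1.2: {"encoder": 6.0, "decoder": 2.3},
}


def duplex_mflops(rate_kbps):
    """Encoder plus decoder load for one two-way (full-duplex) terminal."""
    c = COMPLEXITY_MFLOPS[rate_kbps]
    return c["encoder"] + c["decoder"]


def bits_per_frame(rate_kbps, frame_ms):
    """Bit budget per frame.

    frame_ms is an assumed frame duration, not a value from the paper.
    kbit/s multiplied by ms gives bits.
    """
    return rate_kbps * frame_ms


if __name__ == "__main__":
    for rate in (2.4, 1.2):
        print(f"{rate} kbps: about {duplex_mflops(rate):.1f} MFLOPS for encoder + decoder")
```

With the quoted figures, the combined load comes to about 10 MFLOPS at 2.4 kbps and about 8.3 MFLOPS at 1.2 kbps. Calling `bits_per_frame(2.4, 20)` with an assumed 20 ms frame would give a budget of roughly 48 bits per frame, which shows how tight the bit allocation is at these rates.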
References
Burnett, I.S. and Bradley, G.J. (1995). New techniques for multiprototype waveform coding at 2.84 kbps. Proc. ICASSP'95, pp. 261–264.
Chen, J.H. and Gersho, A. (1995). Adaptive postfiltering for quality enhancement of coded speech. IEEE Trans. Speech and Audio Processing, 3:59–71.
Hardwick, J.C. and Lim, J.S. (1991). The application of the IMBE speech coder to mobile communication. Proc. ICASSP'91, pp. 249–252.
Hou, H. and Andrews, H.C. (1978). Cubic splines for image interpolation and digital filtering. IEEE Trans. on Acoust. Sp. & Sig. Proc., ASSP-26, 6:508–517.
Kleijn, W.B. and Haagen, J. (1994). Transformation and decomposition of the speech signal for coding. IEEE Signal Processing Letters, 1(9):136–138.
Kleijn, W.B. and Haagen, J. (1995a). A speech coder based on decomposition of characteristic waveforms. Proc. ICASSP'95, pp. 508–511.
Kleijn, W.B. and Haagen, J. (1995b). Waveform interpolation for coding and synthesis. In W.B. Kleijn and K.K. Paliwal (Eds.), Speech Coding and Synthesis. Elsevier.
Kleijn, W.B., Shoham, Y., Sen, D. and Hagen, R. (1996). A low-complexity waveform interpolation coder. Proc. ICASSP'96, pp. 212–215.
McCree, A. and Barnwell, T.P. (1995). A mixed excitation LPC vocoder model for low bit rate speech coding. IEEE Trans. Speech and Audio Proc., 3(4):242–250.
Pham, D.H. and Burnett, I.S. (1996). Quantisation techniques for prototype waveforms. Proc. Int. Symp. Sig. Process. and App., ISSPA, pp. 53–56.
Shoham, Y. (1993a). High-quality speech coding at 2.4 to 4.0 kbps based on time-frequency interpolation. Proc. ICASSP'93, pp. III67–III70.
Shoham, Y. (1993b). High-quality speech coding at 2.4 kbps based on time-frequency interpolation. Proc. Eurospeech'93, pp. 741–744.
Shoham, Y. (1997). Very low complexity interpolative speech coding at 1.2 to 2.4 kbps. Proc. ICASSP'97, 2:1599–1602.
Unser, M., Aldroubi, A., and Eden, M. (1993a). B-spline signal processing: Part I—Theory. IEEE Trans. on Sig. Proc., 41(2):821–833.
Unser, M., Aldroubi, A., and Eden, M. (1993b). B-spline signal processing: Part II—Efficient design. IEEE Trans. on Sig. Proc., 41(2):834–848.
Zhou, J., Shoham, Y., and Akansu, A. (1996). Simple fast vector quantization of the line spectral frequencies. Proc. ICSLP'96, 2:945–948 (also available on CDROM).
About this article
Cite this article
Shoham, Y. Low complexity speech coding at 1.2 to 2.4 kbps based on waveform interpolation. Int J Speech Technol 2, 329–341 (1999). https://doi.org/10.1007/BF02108648
DOI: https://doi.org/10.1007/BF02108648