Abstract
ITU-T G.726 ADPCM has traditionally been considered a toll-quality coder and has been deployed ubiquitously in the public switched telephone network (PSTN). It is now also being considered as the baseline method for carrying voice over connection-oriented packet networks such as ATM and frame relay. Even at a coding rate as high as 32 kbit/s, however, ITU-T G.726 ADPCM may still produce coded signals with annoying quantization noise. This paper proposes two perceptually motivated approaches to enhance the performance of ITU-T G.726 ADPCM: (1) noise spectral shaping at the encoder; and (2) adaptive postfiltering at the decoder output. In listening experiments, we found that the combined system at various bit rates (16, 24, and 32 kbit/s) consistently outperforms G.726 ADPCM at the same bit rate. In particular, the combined system operating at 32 kbit/s consistently outperforms ITU-T 16 kbit/s G.728 LD-CELP. At 24 kbit/s, its performance is very close to that of G.728 LD-CELP and/or G.726 ADPCM at 32 kbit/s.
References
Atal, B.S. (1982). Predictive coding of speech at low bit rates. IEEE Trans. Commun., COM-30:600–614.
Atal, B.S. and Schroeder, M.R. (1979). Predictive coding of speech and subjective error criteria. IEEE Trans. Acoust. Speech, Signal Process., ASSP-27:247–254.
Chen, J.-H. and Gersho, A. (1995). Adaptive postfiltering for quality enhancement of coded speech. IEEE Trans. Speech and Audio Processing, SAP-3:59–70.
Gerber, P. and Szajdecki, R. (1998). Private communication (March).
Gerson, I. and Jasiuk, M. (1992). Techniques for improving the performance of CELP-type speech coders. IEEE J. Select. Areas Commun., JSAC-10:858–865.
ITU-T (1989). Recommendation G.48, Specification for An Intermediate Reference System. Geneva.
ITU-T (1990). Recommendation G.727, 5-, 4-, 3-, and 2-bits/sample Embedded Adaptive Differential Pulse Code Modulation (ADPCM). Geneva.
ITU-T (1990b). Recommendation G.726, 40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code Modulation (ADPCM). Geneva.
ITU-T (1992). Recommendation G.728, Coding of Speech at 16 kbit/s Using Low-Delay Code Excited Linear Prediction. Geneva.
ITU-T (1993). Recommendation G.722, 7 kHz Audio Coding Within 64 kbit/s. Geneva.
ITU-T (1996). Recommendation P.861, Objective Quality Measurement of Telephone-Band (300–3400 Hz) Speech Codecs. Geneva.
Kleijn, W.B. and Paliwal, K.K. (1995). Speech Coding and Synthesis. Amsterdam, The Netherlands: Elsevier Science B.V.
Malah, D. and Cox, R.V. (1982). A generalized comb filter technique for speech enhancement. Proc. IEEE ICASSP (April), pp. 160–163.
Makhoul, J. and Berouti, M. (1979). Adaptive noise spectral shaping and entropy coding in predictive coding of speech. IEEE Trans. Acoust. Speech, Signal Process., ASSP-27:63–73.
Minoli, D. and Minoli, E. (1998). Delivering Voice over Frame Relay and ATM. New York: John Wiley & Sons, Inc.
Ramamoorthy, V., Jayant, N.S., Cox, R.V. and Sondhi, M.M. (1988). Enhancement of ADPCM speech coding with backward-adaptive algorithms for postfiltering and noise feedback. IEEE J. Select. Areas Commun., JSAC-6:364–382.
Schroeder, M.R. (1965). U.S. Patent No. 3180936 (April).
Schroeder, M.R., Atal, B.S., and Hall, J.L. (1979). Optimizing digital speech coders by exploiting masking properties of the human ear. J. Acoust. Soc. Amer., 66:1647–1652.
Shoham, Y. (1998). Private communication (March).
Schwartz, M. (1987). Telecommunication Networks: Protocols, Modeling and Analysis. Reading, Massachusetts: Addison-Wesley.
Cite this article
Lee, C.C. An enhanced ADPCM coder for voice over packet networks. Int J Speech Technol 2, 343–357 (1999). https://doi.org/10.1007/BF02108649