Abstract
In this paper, we propose a bandwidth extension (BWE) algorithm for a narrowband speech coder, targeting music delivery services over IP networks. The proposed BWE algorithm has an embedded structure in which a baseline coder is followed by an enhancement layer. To minimize the bit-rate added by the enhancement layer, the algorithm shares spectral envelope and excitation parameters between the baseline coder and the enhancement layer. We choose the iLBC as the baseline coder and use mel-frequency cepstral coefficients (MFCCs) to reconstruct the higher-frequency components at the enhancement layer. As a result, the bit-rate of the proposed BWE coder is 15.45 kbit/s, just 0.25 kbit/s higher than that of the iLBC. We compare the quality of the proposed BWE coder with that of the iLBC, and an informal listening test shows that the proposed coder provides significantly better quality than the iLBC for all four music genres tested: pop, classical, jazz, and rock.
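The bit-rate figures above can be checked with a short calculation. The sketch below assumes the iLBC 20 ms mode (304 bits per frame, per RFC 3951), which is inferred from the 15.2 kbit/s baseline implied by the abstract rather than stated in it. The resulting enhancement-layer budget of 5 bits per frame is a derived figure under that assumption, and the function names are illustrative only.

```python
# Assumption: iLBC runs in its 20 ms mode, which RFC 3951 defines as
# 304 bits per 20 ms frame. The abstract does not state the frame size.
ILBC_BITS_PER_FRAME = 304
FRAME_MS = 20


def bitrate_kbps(bits_per_frame, frame_ms):
    """Bit-rate in kbit/s: bits per millisecond equals kbit per second."""
    return bits_per_frame / frame_ms


def bits_per_frame(rate_kbps, frame_ms):
    """Number of bits available in one frame at the given bit-rate."""
    return rate_kbps * frame_ms


# Baseline iLBC rate under the 20 ms assumption.
baseline_kbps = bitrate_kbps(ILBC_BITS_PER_FRAME, FRAME_MS)

# The abstract reports a 0.25 kbit/s increase for the enhancement layer.
enhancement_bits = bits_per_frame(0.25, FRAME_MS)

# Total rate of the embedded coder (baseline plus enhancement layer).
total_kbps = bitrate_kbps(ILBC_BITS_PER_FRAME + enhancement_bits, FRAME_MS)

print(baseline_kbps, enhancement_bits, total_kbps)
```

Under this assumption the enhancement layer has only a handful of bits per frame, which suggests why sharing the spectral envelope and excitation parameters with the baseline coder matters.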
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, Y.H., Kim, H.K., Lee, M.S., Kim, D.Y. (2007). Bandwidth Extension of a Narrowband Speech Coder for Music Delivery over IP. In: Szczuka, M.S., et al. Advances in Hybrid Information Technology. ICHIT 2006. Lecture Notes in Computer Science, vol 4413. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77368-9_20
DOI: https://doi.org/10.1007/978-3-540-77368-9_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77367-2
Online ISBN: 978-3-540-77368-9
eBook Packages: Computer Science, Computer Science (R0)