Abstract
Public telephone systems transmit speech over a limited frequency range of about 300–3400 Hz, called narrowband (NB), which significantly reduces the quality and intelligibility of speech. This paper proposes a novel, fully backward-compatible method for bandwidth extension of NB speech. The method uses a magnitude-spectrum data hiding technique to provide a perceptually better wideband speech signal. Code-excited linear prediction (CELP) parameters are extracted from a downsampled, frequency-shifted version of the high-frequency components of the speech signal that lie above the NB range. These parameters are spread using pseudo-noise codes and embedded in the low-amplitude, high-frequency regions of the magnitude spectrum of the NB speech signal. The embedded information is extracted at the receiving end to reconstruct the wideband speech signal. Theoretical and simulation analyses show that the proposed method is robust to quantization and channel noise. Comparison category rating (CCR) listening tests and log spectral distortion (LSD) tests show that the reconstructed wideband signal offers much better speech quality than some existing speech bandwidth extension methods that employ data hiding.
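The spreading and embedding step can be illustrated with a minimal sketch in Python with NumPy. It is not the authors' implementation: the function names (`make_pn_code`, `embed_bits`, `extract_bits`), the `base` and `alpha` magnitude levels, the frame length, and the choice of bins are all assumptions made for illustration, and the CELP analysis that would produce the payload bits is omitted. Each payload bit is spread over a pseudo-noise chip sequence, the chips overwrite the magnitudes of a block of high-frequency bins while the phase is kept, and the receiver recovers each bit by correlating the received magnitudes with the same chip sequence.

```python
import numpy as np


def make_pn_code(length, seed=1):
    """Return a +/-1 pseudo-noise chip sequence (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return rng.choice(np.array([-1.0, 1.0]), size=length)


def embed_bits(frame, bits, pn, start_bin, base=1.0, alpha=0.5):
    """Hide bits in the magnitude spectrum of one frame.

    Each bit is mapped to +/-1 and multiplied by the PN code (spreading).
    The chips overwrite the magnitudes of the bins starting at start_bin;
    the original phase is kept. base and alpha are assumed levels, with
    alpha < base so that every written magnitude stays positive.
    """
    spectrum = np.fft.rfft(frame)
    magnitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    symbols = 2.0 * np.asarray(bits, dtype=float) - 1.0
    chips = np.outer(symbols, pn).ravel()  # one PN-length block per bit

    stop_bin = start_bin + chips.size
    if start_bin < 1 or stop_bin > magnitude.size - 1:
        raise ValueError("payload does not fit between the first and last bins")

    magnitude[start_bin:stop_bin] = base + alpha * chips
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=len(frame))


def extract_bits(frame, num_bits, pn, start_bin, base=1.0):
    """Recover bits by correlating the hidden-band magnitudes with the PN code."""
    magnitude = np.abs(np.fft.rfft(frame))
    stop_bin = start_bin + num_bits * pn.size
    chips = magnitude[start_bin:stop_bin] - base
    correlation = chips.reshape(num_bits, pn.size) @ pn  # despreading
    return (correlation > 0).astype(int)


# Usage: a synthetic frame stands in for NB speech, and random bits stand in
# for quantized CELP parameters.
rng = np.random.default_rng(0)
frame = 0.1 * rng.standard_normal(512)
bits = rng.integers(0, 2, size=8)
pn = make_pn_code(16)

stego = embed_bits(frame, bits, pn, start_bin=120)
noisy = stego + 0.001 * rng.standard_normal(stego.size)  # mild additive noise
print(bits, extract_bits(noisy, bits.size, pn, start_bin=120))
```

Because each bit is decided from a correlation over 16 chips, a small perturbation of individual bin magnitudes does not flip the decision, which gives some intuition for the robustness to quantization and channel noise claimed above. The levels used here are arbitrary; a practical system would scale them to the low-amplitude region of the actual speech spectrum.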
Cite this article
Nizampatnam, P., Tappeta, K.K. Bandwidth extension of telephone speech using magnitude spectrum data hiding. Int J Speech Technol 20, 151–162 (2017). https://doi.org/10.1007/s10772-016-9393-x
DOI: https://doi.org/10.1007/s10772-016-9393-x