Bandwidth extension of telephone speech using magnitude spectrum data hiding

International Journal of Speech Technology

Abstract

Public telephone systems transmit speech over a limited frequency range of about 300–3400 Hz, called narrowband (NB), which significantly reduces the quality and intelligibility of speech. This paper proposes a novel, fully backward-compatible method for bandwidth extension of NB speech. The method uses a magnitude spectrum data hiding technique to provide a perceptually better wideband speech signal. Code-excited linear prediction (CELP) parameters are extracted from a downsampled, frequency-shifted version of the speech components lying above the NB range. These parameters are spread using pseudo-noise codes and embedded in the low-amplitude, high-frequency regions of the magnitude spectrum of the NB speech signal. At the receiving end, the embedded information is extracted to reconstruct the wideband speech signal. Theoretical and simulation analyses show that the proposed method is robust to quantization and channel noise. Comparison category rating (CCR) listening tests and log spectral distortion (LSD) measurements show that the reconstructed wideband signal gives much better speech quality than some existing speech bandwidth extension methods that employ data hiding.
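The spreading and embedding steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: CELP parameter extraction is omitted, the payload is a placeholder bit list, and the frame length (256 samples at an assumed 8 kHz sampling rate), the bin range (bins 100 to 127, roughly 3.1 to 4 kHz), the embedding rule `scale * (1 + 0.5 * chip)`, and all function names are hypothetical choices made only for illustration.

```python
import numpy as np

def pn_sequence(length, seed=0):
    """Pseudo-noise code of +/-1 chips (seeded generator, illustrative only)."""
    rng = np.random.default_rng(seed)
    return rng.choice([-1.0, 1.0], size=length)

def spread(bits, pn):
    """Map bits 0/1 to -1/+1 and multiply each symbol by the PN code."""
    symbols = 2.0 * np.asarray(bits, dtype=float) - 1.0
    return np.outer(symbols, pn).ravel()

def despread(chips, pn):
    """Correlate each block of chips with the PN code and take the sign."""
    corr = np.asarray(chips).reshape(-1, pn.size) @ pn
    return (corr > 0).astype(int)

def embed(frame, chips, first_bin, scale):
    """Overwrite selected magnitude bins with chip-dependent values; keep the phase."""
    spectrum = np.fft.rfft(frame)
    mag, phase = np.abs(spectrum), np.angle(spectrum)
    # Assumed embedding rule: magnitudes stay positive (0.5*scale or 1.5*scale).
    mag[first_bin:first_bin + chips.size] = scale * (1.0 + 0.5 * chips)
    return np.fft.irfft(mag * np.exp(1j * phase), n=frame.size)

def extract(frame, n_chips, first_bin, scale):
    """Read the magnitude bins back and undo the assumed embedding rule."""
    mag = np.abs(np.fft.rfft(frame))
    return mag[first_bin:first_bin + n_chips] / scale - 1.0

# Demo on a synthetic frame (random samples stand in for NB speech).
rng = np.random.default_rng(42)
frame = rng.standard_normal(256)
bits = [1, 0, 1, 1]                 # placeholder payload, not real CELP parameters
pn = pn_sequence(7, seed=1)         # 4 bits x 7 chips = 28 bins (100..127)
first_bin, scale = 100, 1.0
marked = embed(frame, spread(bits, pn), first_bin, scale)
noisy = marked + 1e-3 * rng.standard_normal(marked.size)  # stand-in for channel noise
recovered = despread(extract(noisy, len(bits) * pn.size, first_bin, scale), pn)
print(recovered.tolist() == bits)
```

Correlating each block of chips with the PN code averages out perturbations that are uncorrelated with the code, which is the usual reason spread-spectrum embedding tolerates moderate quantization and channel noise.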

Author information

Corresponding author

Correspondence to Prasad Nizampatnam.

About this article

Cite this article

Nizampatnam, P., Tappeta, K.K. Bandwidth extension of telephone speech using magnitude spectrum data hiding. Int J Speech Technol 20, 151–162 (2017). https://doi.org/10.1007/s10772-016-9393-x

  • DOI: https://doi.org/10.1007/s10772-016-9393-x
