Bandwidth extension of telephone speech using magnitude spectrum data hiding

Nizampatnam, Prasad; Tappeta, Kishore Kumar

doi:10.1007/s10772-016-9393-x

Published: 13 January 2017

Volume 20, pages 151–162 (2017)

International Journal of Speech Technology

Prasad Nizampatnam¹ &
Kishore Kumar Tappeta¹

Abstract

Public telephone systems transmit speech over a limited frequency range of about 300–3400 Hz, called narrowband (NB), which significantly reduces the quality and intelligibility of speech. This paper proposes a novel, fully backward-compatible method for bandwidth extension of NB speech. The method uses a magnitude-spectrum data hiding technique to provide a perceptually better wideband speech signal. Code-excited linear prediction (CELP) parameters are extracted from a downsampled, frequency-shifted version of the high-frequency components of the speech signal that lie above the NB range. These parameters are spread using pseudo-noise codes and embedded in the low-amplitude, high-frequency regions of the magnitude spectrum of the NB speech signal. At the receiving end, the embedded information is extracted to reconstruct the wideband speech signal. Theoretical and simulation analyses show that the proposed method is robust to quantization and channel noise. Comparison category rating listening tests and log spectral distortion tests show that the reconstructed wideband signal achieves much better speech quality than some existing speech bandwidth extension methods that employ data hiding.
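
The spreading and embedding steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the frame length, bin range, embedding levels (`base`, `depth`), PN code length, and all function names are assumptions chosen for illustration, and the hidden bits are arbitrary placeholders rather than real CELP parameters. The sketch simply overwrites the magnitudes of a block of high-frequency bins with PN-spread chips, keeps the original phase, and recovers the bits at the receiver by correlating with the same PN code.

```python
import numpy as np

def pn_code(length, seed=0):
    """Pseudo-noise code of +/-1 chips (seeded RNG as a stand-in for an LFSR)."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, size=length) * 2 - 1

def spread(bits, code):
    """Map bits {0,1} to symbols {-1,+1} and multiply each by the PN code."""
    symbols = 2 * np.asarray(bits) - 1
    return np.repeat(symbols, len(code)) * np.tile(code, len(bits))

def embed(frame, chips, first_bin, base=1.0, depth=0.5):
    """Overwrite the magnitudes of high-frequency bins with base + depth*chip.

    The original phase is kept. The original magnitudes in those bins are
    discarded, which is only acceptable where they are already small.
    """
    spec = np.fft.rfft(frame)
    mag, phase = np.abs(spec), np.angle(spec)
    mag[first_bin:first_bin + len(chips)] = base + depth * chips
    return np.fft.irfft(mag * np.exp(1j * phase), n=len(frame))

def extract(frame, code, n_bits, first_bin, base=1.0):
    """Read the bin magnitudes, remove the offset, and despread by correlation."""
    mag = np.abs(np.fft.rfft(frame))
    chips = mag[first_bin:first_bin + n_bits * len(code)] - base
    corr = chips.reshape(n_bits, len(code)) @ code
    return (corr > 0).astype(int)

# Demo on a synthetic frame sampled at 8 kHz (a stand-in for NB speech).
fs, n = 8000, 256
t = np.arange(n) / fs
frame = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

bits = np.array([1, 0, 1, 1])        # placeholder payload, not CELP data
code = pn_code(16, seed=42)          # 16 chips per bit (assumed)
marked = embed(frame, spread(bits, code), first_bin=60)
recovered = extract(marked, code, len(bits), first_bin=60)
print(recovered)
```

Because each bit is spread over several bins, the correlation at the receiver averages out small perturbations of individual magnitudes, which is the intuition behind the robustness to quantization and channel noise claimed in the abstract. A practical system would also need frame synchronization and a rule for selecting bins whose original magnitude is low enough to overwrite without audible degradation.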

Author information

Authors and Affiliations

Department of Electronics and Communication Engineering, National Institute of Technology Warangal, Warangal, Telangana, 506 004, India
Prasad Nizampatnam & Kishore Kumar Tappeta

Corresponding author

Correspondence to Prasad Nizampatnam.

About this article

Cite this article

Nizampatnam, P., Tappeta, K.K. Bandwidth extension of telephone speech using magnitude spectrum data hiding. Int J Speech Technol 20, 151–162 (2017). https://doi.org/10.1007/s10772-016-9393-x

Received: 08 June 2016
Accepted: 16 December 2016
Published: 13 January 2017
Issue Date: March 2017
DOI: https://doi.org/10.1007/s10772-016-9393-x
