Speech Bandwidth Extension Aided by Magnitude Spectrum Data Hiding

Prasad, N.; Kishore Kumar, T.

doi:10.1007/s00034-017-0526-5

Published: 01 March 2017

Volume 36, pages 4512–4540 (2017)

Circuits, Systems, and Signal Processing

N. Prasad¹ & T. Kishore Kumar¹

Abstract

Public telephone systems transmit speech over a limited frequency range of about 300–3400 Hz, called narrowband (NB), which significantly reduces speech quality and intelligibility. This paper proposes a novel, fully backward-compatible method for bandwidth extension of NB speech. The method uses a magnitude spectrum data hiding technique to provide a perceptually better wideband speech signal. Spectral envelope parameters are extracted from a down-sampled, frequency-shifted version of the high-frequency speech components that lie above the NB range. These parameters are encoded, spread using spreading sequences, and embedded in the low-amplitude high-frequency regions of the magnitude spectrum of the NB speech signal. At the receiving end, the embedded information is extracted to reconstruct the wideband speech signal. Theoretical and simulation analyses show that the proposed method is robust to quantization and channel noise. Comparison category rating listening tests and log spectral distortion tests show that the reconstructed wideband signal offers markedly better speech quality than conventional speech bandwidth extension methods that employ data hiding.
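
The spread-and-embed step described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, whose details are not given on this page. It is a generic additive direct-sequence spreading example, and every name in it (make_pn_sequence, spread, embed, extract, the gain alpha, the chip length of 16) is a hypothetical choice made for illustration. The host values stand in for magnitude-spectrum bins of an NB frame; no FFT or speech data is involved.

```python
import random


def make_pn_sequence(length, seed=0):
    """Return a balanced +/-1 spreading sequence (hypothetical helper).

    Balanced means equal counts of +1 and -1, so a constant host level
    contributes nothing to the correlation at the receiver.
    """
    if length % 2:
        raise ValueError("length must be even for a balanced sequence")
    chips = [1.0] * (length // 2) + [-1.0] * (length // 2)
    random.Random(seed).shuffle(chips)
    return chips


def spread(bits, pn):
    """Map each bit to +/-1 and multiply it by the whole PN sequence."""
    chips = []
    for bit in bits:
        symbol = 1.0 if bit else -1.0
        chips.extend(symbol * c for c in pn)
    return chips


def embed(host_mags, chips, alpha):
    """Add the scaled chips to the first len(chips) host magnitudes.

    Assumption: the host values play the role of low-amplitude
    high-frequency magnitude bins. Magnitudes must stay non-negative.
    """
    if len(chips) > len(host_mags):
        raise ValueError("not enough host bins for the payload")
    marked = list(host_mags)
    for i, chip in enumerate(chips):
        marked[i] = host_mags[i] + alpha * chip
    if min(marked) < 0:
        raise ValueError("embedding would produce a negative magnitude")
    return marked


def extract(marked_mags, pn, n_bits):
    """Recover bits by correlating each segment with the PN sequence."""
    n = len(pn)
    bits = []
    for k in range(n_bits):
        segment = marked_mags[k * n:(k + 1) * n]
        corr = sum(m * c for m, c in zip(segment, pn))
        bits.append(1 if corr > 0 else 0)
    return bits


if __name__ == "__main__":
    payload = [1, 0, 1, 1, 0, 0, 1, 0]
    pn = make_pn_sequence(16, seed=1)
    rng = random.Random(2)
    # Synthetic stand-in for low-amplitude magnitude bins.
    host = [0.06 + 0.01 * rng.random() for _ in range(len(payload) * len(pn))]
    marked = embed(host, spread(payload, pn), alpha=0.05)
    # Small bounded perturbation standing in for quantization/channel noise.
    noisy = [m + rng.uniform(-0.01, 0.01) for m in marked]
    print(extract(noisy, pn, len(payload)) == payload)
```

In this sketch, each bit contributes alpha times the chip count to the correlation, while host variation and noise contribute only a bounded residual. That is the usual reason spreading gives robustness to quantization and channel noise. The gain and chip length here are arbitrary, and a real system would choose them against audibility and capacity constraints.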

Author information

Authors and Affiliations

Department of Electronics and Communication Engineering, National Institute of Technology Warangal, Warangal, 506004, India
N. Prasad & T. Kishore Kumar

Corresponding author

Correspondence to N. Prasad.

About this article

Cite this article

Prasad, N., Kishore Kumar, T. Speech Bandwidth Extension Aided by Magnitude Spectrum Data Hiding. Circuits Syst Signal Process 36, 4512–4540 (2017). https://doi.org/10.1007/s00034-017-0526-5

Received: 05 July 2016
Revised: 15 February 2017
Accepted: 16 February 2017
Published: 01 March 2017
Issue Date: November 2017
DOI: https://doi.org/10.1007/s00034-017-0526-5
