Iterative-processed multiband speech enhancement for suppressing musical sounds

Multimedia Tools and Applications

Abstract

Multiband spectral subtraction (MBSS) processing transforms background noise into annoying musical sounds. This paper proposes an iterative-processed multiband speech enhancement (IP-MBSE) post-processing method for suppressing these musical sounds in enhanced speech recordings. In the proposed technique, the output of one MBSS pass is used as the input to the next iteration. The noise spectrum is re-estimated in each iteration, and spectral subtraction is carried out in each subband individually. Feeding the estimated speech back as the input and repeating the process reduces the musical sounds further. The procedure is repeated only a few times. The performance of IP-MBSE is measured using (i) objective measures, namely signal-to-noise ratio (SNR), segmental SNR (SegSNR), and perceptual evaluation of speech quality (PESQ), and (ii) subjective measures, namely mean opinion score (MOS) and spectrograms, at various SNR levels. The results of IP-MBSE are compared with those of conventional MBSS, and the speech estimated by IP-MBSE is found to be more pleasant to listeners.
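The iteration described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract does not give the band layout, the noise estimator, or the subtraction rule, so the uniform bands, the noise estimate taken from the leading frames, the linear over-subtraction rule, the spectral floor, and all names (`mbss_pass`, `ip_mbse`, `noise_frames`, `floor`) are assumptions made here for illustration.

```python
import numpy as np

def _frames(x, frame_len, hop):
    # Slice the signal into overlapping frames (one frame per row).
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def mbss_pass(x, noise_frames=6, n_bands=4, frame_len=256, hop=128, floor=0.02):
    """One multiband spectral subtraction pass (illustrative assumptions only)."""
    eps = 1e-12
    # Periodic Hann window: copies shifted by half a frame sum to one,
    # so plain overlap-add reconstructs the signal.
    window = np.hanning(frame_len + 1)[:-1]
    spec = np.fft.rfft(_frames(x, frame_len, hop) * window, axis=1)
    power, phase = np.abs(spec) ** 2, np.angle(spec)

    # Assumption: the first few frames contain noise only.
    noise_power = power[:noise_frames].mean(axis=0)

    clean_power = np.empty_like(power)
    # Assumption: uniformly spaced, non-overlapping subbands.
    edges = np.linspace(0, power.shape[1], n_bands + 1).astype(int)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band, band_noise = power[:, lo:hi], noise_power[lo:hi]
        # Per-frame SNR of this band, in dB.
        snr_db = 10 * np.log10((band.sum(axis=1) + eps) / (band_noise.sum() + eps))
        # Assumed linear over-subtraction rule: subtract more at low SNR.
        alpha = np.clip(4.0 - 0.15 * snr_db, 1.0, 4.75)
        subtracted = band - alpha[:, None] * band_noise
        # Spectral floor keeps the estimated power non-negative.
        clean_power[:, lo:hi] = np.maximum(subtracted, floor * band)

    # Resynthesise with the noisy phase and overlap-add the frames.
    frames = np.fft.irfft(np.sqrt(clean_power) * np.exp(1j * phase),
                          n=frame_len, axis=1)
    out = np.zeros(len(x))  # samples past the last full frame stay zero
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += frame
    return out

def ip_mbse(noisy, n_iter=3, **kwargs):
    """Feed each pass's output back in as the next pass's input."""
    estimate = np.asarray(noisy, dtype=float)
    for _ in range(n_iter):
        # The noise spectrum is re-estimated inside every pass.
        estimate = mbss_pass(estimate, **kwargs)
    return estimate
```

Calling `ip_mbse(noisy, n_iter=3)` on a recording that starts with a short noise-only stretch runs three passes; `n_iter=1` reduces to a single MBSS pass under the same assumptions. The small default for `n_iter` mirrors the abstract's remark that the procedure is repeated only a few times.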

Author information

Corresponding author

Correspondence to Navneet Upadhyay.

About this article

Cite this article

Upadhyay, N. Iterative-processed multiband speech enhancement for suppressing musical sounds. Multimed Tools Appl 83, 45423–45441 (2024). https://doi.org/10.1007/s11042-023-17336-z

  • DOI: https://doi.org/10.1007/s11042-023-17336-z
