Abstract
Multiband spectral subtraction (MBSS) processing transforms background noise into annoying musical sounds. This paper proposes an iterative-processed multiband speech enhancement (IP-MBSE) post-processing method for suppressing musical sounds in enhanced speech recordings. In the proposed technique, the output of the MBSS processing is used as the input to the next iteration. The noise spectrum is re-estimated in each iteration, and spectral subtraction is performed in each subband individually. Feeding the estimated speech back to the input and repeating the process reduces the musical sound further. The procedure is repeated only a few times. The performance of IP-MBSE is measured using (i) objective measures such as signal-to-noise ratio (SNR), segmental SNR (SegSNR), and perceptual evaluation of speech quality (PESQ), and (ii) subjective measures such as mean opinion score (MOS) and spectrograms, at various SNR levels. The IP-MBSE results are compared with those of conventional MBSS, and the speech estimated by IP-MBSE is found to be more pleasant to listeners.
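The iteration described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives no parameter values, so the function names (`band_alpha`, `mbss_pass`, `ip_mbse`), the piecewise-linear over-subtraction rule, the spectral floor, and the use of the leading frames as a noise-only estimate are all assumptions made for illustration. The sketch works on precomputed power spectra and leaves out framing, the FFT, and phase handling.

```python
import math
from typing import List

Frame = List[float]  # power spectrum |Y(k)|^2 of one frame


def band_alpha(snr_db: float) -> float:
    """Over-subtraction factor for one subband (assumed piecewise-linear rule)."""
    if snr_db < -5.0:
        return 4.75
    if snr_db > 20.0:
        return 1.0
    return 4.0 - 0.15 * snr_db


def mbss_pass(frames: List[Frame], n_bands: int = 4,
              n_noise_frames: int = 2, floor: float = 0.01) -> List[Frame]:
    """One multiband spectral subtraction pass over a list of power spectra."""
    n_bins = len(frames[0])
    # Assumption: the first n_noise_frames frames contain noise only.
    lead = frames[:n_noise_frames]
    noise = [sum(f[k] for f in lead) / len(lead) for k in range(n_bins)]
    # Contiguous, roughly equal-width subbands.
    bands = [range(i * n_bins // n_bands, (i + 1) * n_bins // n_bands)
             for i in range(n_bands)]
    eps = 1e-12  # guards the logarithm against empty or silent bands
    out = []
    for frame in frames:
        clean = list(frame)
        for band in bands:
            sig = sum(frame[k] for k in band)
            noi = sum(noise[k] for k in band)
            snr_db = 10.0 * math.log10((sig + eps) / (noi + eps))
            alpha = band_alpha(snr_db)
            for k in band:
                # Subtract the scaled noise estimate; floor to keep power positive.
                clean[k] = max(frame[k] - alpha * noise[k], floor * frame[k])
        out.append(clean)
    return out


def ip_mbse(frames: List[Frame], n_iter: int = 3, **kwargs) -> List[Frame]:
    """Feed each pass's output back in as the next input (noise is re-estimated)."""
    estimate = frames
    for _ in range(n_iter):
        estimate = mbss_pass(estimate, **kwargs)
    return estimate


# Toy input: two noise-only frames, then a frame with speech energy in the low band.
noisy = [[1.0] * 8, [1.0] * 8, [101.0] * 4 + [1.0] * 4]
enhanced = ip_mbse(noisy, n_iter=3, n_bands=2)
```

Because the noise estimate is recomputed from the previous output, each pass subtracts the residual left by the one before it. In this toy input the noise-only bins shrink with every iteration, while the high-SNR speech band is barely changed.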
About this article
Cite this article
Upadhyay, N. Iterative-processed multiband speech enhancement for suppressing musical sounds. Multimed Tools Appl 83, 45423–45441 (2024). https://doi.org/10.1007/s11042-023-17336-z
DOI: https://doi.org/10.1007/s11042-023-17336-z