Ideal binary masking for reducing convolutive noise

Saleem, Nasir; Mustafa, Ehtasham; Nawaz, Aamir; Khan, Adnan

doi:10.1007/s10772-015-9298-0

Published: 14 August 2015

International Journal of Speech Technology, Volume 18, pages 547–554 (2015)

Abstract

It is important to know the degree to which convolutive noise disrupts the perceptual quality and intelligibility of speech. This paper presents an ideal binary masking criterion for reducing convolutive noise (reverberation) and improving the quality and intelligibility of speech. The noise is suppressed using ideal binary time–frequency (T–F) masking based on the signal-to-reverberation ratio (SRR) of individual T–F channels. All T–F channels with an SRR greater than a preselected threshold are retained, while the others are eliminated. The performance of the algorithm is evaluated using IEEE sentences corrupted by reverberation with reverberation times (RT₆₀) ranging from 0.3 to 2.0 s. The results indicate that the intelligibility and perceptual quality of speech decrease as the reverberation time increases. Additional analyses indicate that ideal binary masking reduces the temporal envelope spreading introduced by reverberation. The algorithm is evaluated with the perceptual evaluation of speech quality (PESQ), SNR_LOSS, the log-likelihood ratio (LLR), and the frequency-weighted segmental signal-to-noise ratio.
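
The masking rule described in the abstract (retain T–F channels whose SRR exceeds a threshold, eliminate the rest) can be illustrated with a minimal sketch. Everything below is an assumption for illustration and is not taken from the paper: the SRR is taken to be the ratio of clean energy to the energy of the reverberant-minus-clean residual, the room impulse response is modelled as exponentially decaying noise with a chosen RT₆₀, a noise burst stands in for speech, and the −5 dB threshold, the frame sizes, and the function names `stft`, `srr_binary_mask`, `synthetic_rir`, and `demo` are all hypothetical.

```python
import numpy as np


def stft(x, frame_len=256, hop=128):
    """Complex STFT (frames x frequency bins) using a Hann window."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def srr_binary_mask(clean_spec, reverb_spec, threshold_db=-5.0, eps=1e-12):
    """Boolean mask that is True where the per-unit SRR exceeds threshold_db.

    Assumed definition: SRR = clean energy / energy of (reverberant - clean).
    eps avoids log(0); units where both energies are zero get an SRR of 0 dB.
    """
    clean_energy = np.abs(clean_spec) ** 2
    residual_energy = np.abs(reverb_spec - clean_spec) ** 2
    srr_db = 10.0 * np.log10((clean_energy + eps) / (residual_energy + eps))
    return srr_db > threshold_db


def synthetic_rir(rt60, fs, rng, tail_gain=0.05):
    """Crude impulse response: unit direct path plus decaying noise tail.

    The envelope exp(-6.908 * t / rt60) falls by 60 dB after rt60 seconds.
    """
    n = int(rt60 * fs)
    t = np.arange(n) / fs
    rir = tail_gain * rng.standard_normal(n) * np.exp(-6.908 * t / rt60)
    rir[0] = 1.0
    return rir


def demo(rt60=0.3, fs=8000, threshold_db=-5.0, seed=0):
    """Mask a synthetic reverberant signal; returns the spectra and the mask."""
    rng = np.random.default_rng(seed)
    clean = np.zeros(fs)                      # 1 s of signal
    clean[:2000] = rng.standard_normal(2000)  # noise burst standing in for speech
    rir = synthetic_rir(rt60, fs, rng)
    reverberant = np.convolve(clean, rir)[:len(clean)]
    clean_spec, reverb_spec = stft(clean), stft(reverberant)
    mask = srr_binary_mask(clean_spec, reverb_spec, threshold_db)
    enhanced_spec = reverb_spec * mask        # eliminated units are zeroed
    return clean_spec, reverb_spec, mask, enhanced_spec


if __name__ == "__main__":
    _, _, mask, _ = demo()
    print("fraction of T-F units retained:", mask.mean())
```

In this sketch the frames that follow the burst contain only the reverberant tail, so their SRR is very low and the mask eliminates them, which is one way to picture the reduction in temporal envelope spreading. The mask is "ideal" in the sense that it needs the clean signal, so it serves as an upper bound rather than a deployable algorithm.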

Author information

Authors and Affiliations

Institute of Engineering & Technology, Gomal University, D. I. Khan, 29050, Pakistan
Nasir Saleem, Ehtasham Mustafa, Aamir Nawaz & Adnan Khan

Corresponding author

Correspondence to Nasir Saleem.

About this article

Cite this article

Saleem, N., Mustafa, E., Nawaz, A. et al. Ideal binary masking for reducing convolutive noise. Int J Speech Technol 18, 547–554 (2015). https://doi.org/10.1007/s10772-015-9298-0

Received: 31 March 2015
Accepted: 03 August 2015
Published: 14 August 2015
Issue Date: December 2015
DOI: https://doi.org/10.1007/s10772-015-9298-0
