Speech enhancement based on perceptual filter bank improvement

International Journal of Speech Technology

Abstract

In this paper, a new denoising approach based on perceptual analysis is presented. The noisy signal is split into sub-bands by a gammatone filterbank whose center frequencies are distributed nonlinearly according to the ERB (equivalent rectangular bandwidth) scale. In each sub-band, the frequency masking threshold is calculated at the output of the Wiener filter according to the Johnston model. This threshold is then used in the gain function of the perceptual filter. The approach is evaluated with objective criteria, including the perceptual evaluation of speech quality (PESQ), and subjective criteria, including the mean opinion score (MOS). The results show that the proposed method achieves the best results in terms of the quality and intelligibility of the enhanced signal.
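The ERB-scale spacing mentioned in the abstract can be illustrated with a minimal sketch. It assumes the widely used Glasberg and Moore approximation of the ERB scale; the abstract does not state which formula, frequency range, or number of bands the authors used, so the function names, the 100 Hz to 4000 Hz range, and the 16 bands below are hypothetical choices for illustration only.

```python
import numpy as np

# Assumption: Glasberg & Moore ERB approximation; not confirmed by the abstract.

def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale (number of ERBs below f)."""
    return 21.4 * np.log10(4.37e-3 * np.asarray(f_hz, dtype=float) + 1.0)

def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate: map ERB-rate back to frequency in Hz."""
    return (10.0 ** (np.asarray(erb_rate, dtype=float) / 21.4) - 1.0) / 4.37e-3

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at center frequency f_hz."""
    return 24.7 * (4.37e-3 * np.asarray(f_hz, dtype=float) + 1.0)

def erb_center_frequencies(f_low, f_high, n_bands):
    """Center frequencies spaced uniformly on the ERB-rate scale."""
    erb_points = np.linspace(hz_to_erb_rate(f_low), hz_to_erb_rate(f_high), n_bands)
    return erb_rate_to_hz(erb_points)

# Hypothetical configuration: 16 bands between 100 Hz and 4000 Hz.
centers = erb_center_frequencies(100.0, 4000.0, 16)
bandwidths = erb_bandwidth(centers)
```

Uniform steps on the ERB-rate scale give narrow, closely spaced bands at low frequencies and wider, sparser bands at high frequencies, which is the nonlinear distribution the abstract refers to. Each center frequency and bandwidth pair would then parameterize one gammatone filter.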

References

  • Amehraye, A., Pastor, D., & Tamtaoui, A. (2008). Perceptual improvement of Wiener filtering. ICASSP’08 (pp. 2081–2084). Las Vegas, USA.

  • Berouti, M., Schwartz, R., & Makhoul, J. (1979). Enhancement of speech corrupted by acoustic noise. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (pp. 208–211).

  • Ephraim, Y., & Malah, D. (1984). Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(6), 1109–1121.

  • Hirsch, H., & Pearce, D. (2000). The Aurora Experimental Framework for the Performance Evaluation of Speech Recognition Systems under Noisy Conditions. ISCA ITRW ASR2000 (pp. 18–20). Paris, France.

  • Hohmann, V. (2002). Frequency analysis and synthesis using a Gammatone filterbank. Acta Acustica United with Acustica, 88(3), 433–442.

  • ITU-T recommendation P.800. (1996). Methods for subjective determination of transmission quality. Geneva: International Telecommunication Union.

  • ITU-T recommendation P.862. (2001). Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs. Geneva: International Telecommunication Union.

  • Sohn, J., & Kim, N. (1999). Statistical model based voice activity detection. IEEE Signal Processing Letters, 6(1), 1–3.

  • Johnston, J. D. (1988). Transform coding of audio signals using perceptual noise criteria. IEEE Journal on Selected Areas in Communications, 6, 314–323.

  • Loizou, P. (2007). Speech enhancement: Theory and practice. Boca Raton, FL: CRC Press.

  • Moore, B. C. J., & Glasberg, B. R. (1996). A revision of Zwicker’s loudness model. Acta Acustica, 82, 335–345.

  • O’Shaughnessy, D. (1987). Speech communication: Human and machine. New York, NY: Addison-Wesley.

  • Rao, C. V. R., Murthy, M. B. R., & Rao, K. S. (2011). Speech enhancement using cross-correlation compensated multi-band Wiener filter combined with constrained perceptual weighting filter. In IEEE Emerging Trends and Applications in Computer Science (pp. 1–6). Shillong, March 2011.

  • Rao, C. V. R., Murthy, M. B. R., & Rao, K. S. (2011). Speech enhancement using perceptual Wiener filter combined with unvoiced speech—A new scheme. In IEEE Recent Advances in Intelligent Computational Systems (RAICS) (pp. 688–691) Trivandrum, Sept 2011.

  • Strobach, P. (2000). Equirotational stack parameterization in subspace estimation and tracking. IEEE Transactions on Signal Processing, 48, 712–722.

  • Virag, N. (1999). Single channel speech enhancement based on masking properties of the human auditory system. IEEE Transactions on Speech and Audio Processing, 7, 126–137.

  • Zoghlami, N., Lachiri, Z., & Ellouze, N. (2009). Noise reduction based on perceptual speech analysis. In EURONOISE 8th European Conference on Noise Control (pp. 26–28). Edinburgh, Scotland.

Author information

Corresponding author

Correspondence to Sana Alaya.

About this article

Cite this article

Alaya, S., Zoghlami, N. & Lachiri, Z. Speech enhancement based on perceptual filter bank improvement. Int J Speech Technol 17, 253–258 (2014). https://doi.org/10.1007/s10772-014-9226-8

  • DOI: https://doi.org/10.1007/s10772-014-9226-8
