Speech intelligibility enhancement: a hybrid wiener approach

International Journal of Speech Technology

Abstract

Speech enhancement aims to improve the intelligibility and quality of a speech signal. In practice, this means reducing noise so that the intended speech can be extracted from the corrupted signal. Noise reduction techniques such as Kalman filtering, spectral subtraction, and adaptive Wiener filtering are used in different enhancement scenarios. The proposed method combines a Wiener filter with the Karhunen–Loève Transform (KLT) to remove noise from the noisy speech signal. This paper evaluates the proposed hybrid algorithm using Signal-to-Noise Ratio (SNR), Perceptual Evaluation of Speech Quality (PESQ), Short-Time Objective Intelligibility (STOI), and Extended STOI (ESTOI) values. The algorithm was tested under a variety of noise conditions, and the results show that the method is effective. A subjective listening evaluation was also carried out, and both the objective and subjective results confirm a significant improvement in speech intelligibility with the proposed method.
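
The abstract does not give the details of the hybrid algorithm, so the following is only a minimal sketch of one common way to apply a Wiener-type gain in the KLT domain, not the authors' implementation. It assumes additive white noise with a known variance; the function names `klt_wiener` and `snr_db`, the frame length of 32, and the synthetic sine-wave test signal are all illustrative choices that do not come from the paper. PESQ, STOI and ESTOI are not computed here.

```python
import numpy as np

def snr_db(clean, estimate):
    """Signal-to-noise ratio of `estimate` relative to `clean`, in dB."""
    residual = clean - estimate
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(residual ** 2))

def klt_wiener(noisy, noise_var, frame_len=32):
    """Wiener-type gain in the KLT domain (assumes white noise, known variance > 0)."""
    n_frames = len(noisy) - frame_len + 1
    # Each row of `frames` is one sliding window of length `frame_len`.
    idx = np.arange(frame_len)[None, :] + np.arange(n_frames)[:, None]
    frames = noisy[idx]
    # Sample covariance of the noisy frames; its eigenvectors form the KLT basis.
    cov = frames.T @ frames / n_frames
    eigvals, eigvecs = np.linalg.eigh(cov)
    # For white noise, clean-signal eigenvalues are the noisy ones minus the noise floor.
    clean_eig = np.maximum(eigvals - noise_var, 0.0)
    gain = clean_eig / (clean_eig + noise_var)
    # Forward KLT, per-component Wiener gain, inverse KLT.
    enhanced = ((frames @ eigvecs) * gain) @ eigvecs.T
    # Average the overlapping estimates of each sample.
    out = np.zeros(len(noisy))
    count = np.zeros(len(noisy))
    np.add.at(out, idx, enhanced)
    np.add.at(count, idx, 1.0)
    return out / count

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(4000)
    clean = np.sin(2 * np.pi * 0.05 * t)             # stand-in for a speech signal
    noisy = clean + rng.normal(0.0, 0.5, size=t.size)  # white noise, variance 0.25
    enhanced = klt_wiener(noisy, noise_var=0.25)
    print("input SNR  (dB):", snr_db(clean, noisy))
    print("output SNR (dB):", snr_db(clean, enhanced))
```

In this sketch the gain for each KLT component is the ratio of the estimated clean-signal eigenvalue to the noisy eigenvalue floor-corrected sum, so components dominated by noise are attenuated while components dominated by signal pass nearly unchanged. Real speech is nonstationary and real noise is often colored, so a practical system would estimate the covariance and the noise statistics per segment rather than once for the whole signal.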

Author information

Corresponding author

Correspondence to V. Srinivasarao.

About this article

Cite this article

Srinivasarao, V., Ghanekar, U. Speech intelligibility enhancement: a hybrid wiener approach. Int J Speech Technol 23, 517–525 (2020). https://doi.org/10.1007/s10772-020-09737-4

  • DOI: https://doi.org/10.1007/s10772-020-09737-4
