Abstract
Speech enhancement aims to improve the intelligibility and quality of a speech signal through various algorithms and techniques. Processing a speech signal involves applying efficient mechanisms to reduce noise so that the intended speech can be extracted from the corrupted signal. Noise reduction techniques such as Kalman filtering, spectral subtraction, and adaptive Wiener filtering are used in different speech enhancement scenarios. The proposed method combines the Wiener filter with the Karhunen–Loève Transform (KLT) to remove noise and enhance the noisy speech signal. This paper evaluates the proposed hybrid algorithm in terms of Signal-to-Noise Ratio (SNR), Perceptual Evaluation of Speech Quality (PESQ), Short-Time Objective Intelligibility (STOI), and Extended STOI (ESTOI). The algorithm was tested under varied noise conditions, and the results demonstrate the effectiveness of the method. A subjective listening evaluation was also carried out; both the objective and subjective results confirm a significant improvement in speech intelligibility with the proposed method.
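As a small illustration of the first objective measure named in the abstract, the sketch below computes a Signal-to-Noise Ratio in decibels using the common definition SNR = 10 log10(signal power / residual-noise power). It is not the paper's evaluation code: the function name `snr_db`, the toy sine signal, and the fixed perturbation amplitudes are assumptions chosen only to make the example self-contained.

```python
import math


def snr_db(clean, processed):
    """SNR in dB, treating (clean - processed) as the residual noise.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(clean) != len(processed):
        raise ValueError("signals must have the same length")
    signal_power = sum(s * s for s in clean)
    noise_power = sum((s - p) ** 2 for s, p in zip(clean, processed))
    if signal_power == 0:
        raise ValueError("clean signal has zero power")
    if noise_power == 0:
        return math.inf  # processed signal matches the clean one exactly
    return 10.0 * math.log10(signal_power / noise_power)


# Toy data (assumed): 5 cycles of a sine wave over 100 samples.
clean = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
# "Noisy" input: alternating +/-0.1 perturbation added to the clean signal.
noisy = [s + 0.1 * (-1) ** n for n, s in enumerate(clean)]
# Stand-in for an enhancer output: the same perturbation reduced tenfold.
enhanced = [s + 0.01 * (-1) ** n for n, s in enumerate(clean)]

snr_in = snr_db(clean, noisy)
snr_out = snr_db(clean, enhanced)
print(f"input SNR:  {snr_in:.2f} dB")
print(f"output SNR: {snr_out:.2f} dB")
print(f"improvement: {snr_out - snr_in:.2f} dB")
```

Reducing the residual amplitude by a factor of ten lowers the noise power by a factor of one hundred, so the toy example shows a 20 dB gain. PESQ, STOI, and ESTOI involve perceptual and time-frequency modelling and are not reproduced here.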
Cite this article
Srinivasarao, V., Ghanekar, U. Speech intelligibility enhancement: a hybrid wiener approach. Int J Speech Technol 23, 517–525 (2020). https://doi.org/10.1007/s10772-020-09737-4
DOI: https://doi.org/10.1007/s10772-020-09737-4