Abstract
Speech enhancement is an important step in improving the quality and intelligibility of noisy speech signals. In practical environments more than one noise source is usually present, so an algorithm is needed that can remove mixed noise from single-channel speech signals. In this paper, a single-channel speech enhancement method is proposed for the reduction of mixed non-stationary noise. The proposed method is based on the wavelet packet transform and an ideal binary mask thresholding function. A wavelet packet transform with the Db10 mother wavelet is used to decompose the speech signal to three levels. After decomposition, a binary mask threshold function is applied to suppress the noise-dominated coefficients of the noisy speech signal. The performance of the proposed wavelet packet with ideal binary mask method is compared with that of Wiener filtering, spectral subtraction, p-MMSE, log-MMSE, ideal channel selection, ideal binary mask, and hard and soft wavelet thresholding in terms of PESQ, SNR improvement, cepstral distance, and frequency-weighted segmental SNR. The proposed method shows improved performance over these conventional speech enhancement methods.
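The pipeline described in the abstract (three-level wavelet packet decomposition, binary masking of the coefficients, reconstruction) can be illustrated with a minimal sketch. This is not the authors' implementation. To stay dependency-free it swaps in the Haar wavelet for Db10, and it assumes a per-coefficient mask with a 0 dB local SNR criterion, since the abstract does not define the mask. The function names, the synthetic test signal, and the two-source noise mix are all hypothetical. Because the mask is "ideal", it is computed from the clean signal and the noise separately, which is only possible in a controlled experiment.

```python
import math
import random

SQRT2 = math.sqrt(2.0)

def haar_step(x):
    """One orthonormal Haar analysis step (stand-in for Db10): (approx, detail)."""
    a = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    d = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return a, d

def haar_inverse_step(a, d):
    """Invert haar_step: interleave the reconstructed even/odd samples."""
    x = []
    for ai, di in zip(a, d):
        x.append((ai + di) / SQRT2)
        x.append((ai - di) / SQRT2)
    return x

def wp_decompose(x, levels):
    """Full wavelet packet tree: every node is split, giving 2**levels leaf bands."""
    nodes = [list(x)]
    for _ in range(levels):
        nxt = []
        for node in nodes:
            nxt.extend(haar_step(node))  # keep (approx, detail) siblings adjacent
        nodes = nxt
    return nodes

def wp_reconstruct(nodes):
    """Merge adjacent sibling bands level by level back into one signal."""
    while len(nodes) > 1:
        nodes = [haar_inverse_step(nodes[i], nodes[i + 1])
                 for i in range(0, len(nodes), 2)]
    return nodes[0]

def ideal_binary_mask(clean_bands, noise_bands, lc_db=0.0):
    """1 where the local SNR of a coefficient exceeds lc_db, else 0 (assumed rule)."""
    eps = 1e-12
    return [[1.0 if 10 * math.log10((c * c + eps) / (n * n + eps)) > lc_db else 0.0
             for c, n in zip(cb, nb)]
            for cb, nb in zip(clean_bands, noise_bands)]

def enhance(clean, noise, levels=3):
    """Mask the noisy wavelet packet coefficients and reconstruct."""
    noisy = [c + n for c, n in zip(clean, noise)]
    mask = ideal_binary_mask(wp_decompose(clean, levels), wp_decompose(noise, levels))
    kept = [[w * m for w, m in zip(band, mb)]
            for band, mb in zip(wp_decompose(noisy, levels), mask)]
    return noisy, wp_reconstruct(kept)

def snr_db(ref, est):
    """Global SNR of est against the reference signal, in dB."""
    sig = sum(r * r for r in ref)
    err = sum((r - e) ** 2 for r, e in zip(ref, est))
    return 10 * math.log10(sig / (err + 1e-12))

# Hypothetical test data: a tone as "speech", plus two mixed noise sources
# (white noise and an interfering tone whose level grows over time).
random.seed(0)
N = 256  # must be divisible by 2**levels
clean = [math.sin(2 * math.pi * 5 * i / N) for i in range(N)]
noise = [0.3 * random.gauss(0, 1) + (i / N) * 0.5 * math.sin(2 * math.pi * 60 * i / N)
         for i in range(N)]
noisy, enhanced = enhance(clean, noise, levels=3)
print("input SNR (dB): ", snr_db(clean, noisy))
print("output SNR (dB):", snr_db(clean, enhanced))
```

With an orthonormal transform and a 0 dB criterion, each discarded coefficient contributes an error no larger than the noise it would otherwise have carried, so the output SNR of this sketch cannot fall below the input SNR. That guarantee relies on the oracle mask and says nothing about the PESQ or cepstral distance results reported in the paper.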
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Singh, S., Tripathy, M., Anand, R.S. (2014). Single Channel Speech Enhancement for Mixed Non-stationary Noise Environments. In: Thampi, S., Gelbukh, A., Mukhopadhyay, J. (eds) Advances in Signal Processing and Intelligent Recognition Systems. Advances in Intelligent Systems and Computing, vol 264. Springer, Cham. https://doi.org/10.1007/978-3-319-04960-1_47
DOI: https://doi.org/10.1007/978-3-319-04960-1_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04959-5
Online ISBN: 978-3-319-04960-1
eBook Packages: Engineering, Engineering (R0)