Enhancement of speech signal using diminished empirical mean curve decomposition-based adaptive Wiener filtering

  • Theoretical advances
Pattern Analysis and Applications

Abstract

Speech enhancement has been a widely studied research topic over the last few decades, and numerous algorithms have been proposed to improve the perceptibility and quality of speech signals. These algorithms are typically formulated to recover the clean signal from a noise-corrupted observation, and most of them process the speech with the short-time Fourier transform or the wavelet transform. This paper attempts to overcome the common drawbacks of existing speech enhancement algorithms. Because frequency-domain processing removes noise effectively, the enhancement is carried out in the short-time Fourier domain. In addition, the paper introduces a decomposition model, named diminished empirical mean curve decomposition, that adaptively tunes the Wiener filtering process to achieve effective speech enhancement. The proposed method is compared with conventional methods and is observed to outperform them.
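
As a rough illustration of the short-time Fourier domain Wiener filtering that the abstract refers to, the following Python sketch applies a textbook Wiener gain frame by frame. It is not the diminished empirical mean curve decomposition method of the paper: the function name `wiener_enhance`, its parameters, and the noise estimate taken from the leading frames (assumed here to contain no speech) are all illustrative assumptions.

```python
import numpy as np


def wiener_enhance(noisy, frame_len=256, hop=128, noise_frames=6, gain_floor=0.05):
    """Textbook STFT-domain Wiener filter (illustrative, not the paper's method).

    Assumes the first `noise_frames` frames contain noise only and that
    `hop` is half of `frame_len`, so the Hann windows overlap-add to one.
    """
    noisy = np.asarray(noisy, dtype=float)
    if len(noisy) < frame_len:
        raise ValueError("signal is shorter than one frame")

    # Periodic Hann window: shifted copies sum to 1 at 50% overlap.
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (len(noisy) - frame_len) // hop

    # Analysis: windowed frames and their one-sided spectra.
    frames = np.stack(
        [noisy[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    spec = np.fft.rfft(frames, axis=1)
    power = np.abs(spec) ** 2

    # Noise power spectrum estimated from the leading (assumed speech-free) frames.
    noise_psd = power[:noise_frames].mean(axis=0)

    # Maximum-likelihood a priori SNR estimate, then the Wiener gain xi / (1 + xi).
    snr_prio = np.maximum(power / (noise_psd + 1e-12) - 1.0, 0.0)
    gain = np.maximum(snr_prio / (1.0 + snr_prio), gain_floor)

    # Synthesis: inverse transform and overlap-add.
    enhanced = np.fft.irfft(gain * spec, n=frame_len, axis=1)
    out = np.zeros(len(noisy))
    for i in range(n_frames):
        out[i * hop : i * hop + frame_len] += enhanced[i]
    return out
```

The sketch keeps the noise estimate fixed after the leading frames. According to the abstract, the paper instead tunes the Wiener filtering adaptively through its decomposition model, which is not reproduced here.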



Author information

Corresponding author

Correspondence to Anil Garg.

About this article

Cite this article

Garg, A., Sahu, O.P. Enhancement of speech signal using diminished empirical mean curve decomposition-based adaptive Wiener filtering. Pattern Anal Applic 23, 179–198 (2020). https://doi.org/10.1007/s10044-018-00768-x

  • DOI: https://doi.org/10.1007/s10044-018-00768-x
