Improved Empirical Mode Decomposition Using Optimal Recursive Averaging Noise Estimation for Speech Enhancement

Circuits, Systems, and Signal Processing

Abstract

This paper presents a new approach to robust speech enhancement based on improved ensemble empirical mode decomposition (EMD) with optimized log-spectral amplitude noise estimation. The noisy signal is decomposed adaptively into a sum of oscillating components called intrinsic mode functions (IMFs). Each IMF is then enhanced separately, and the resulting less-corrupted IMFs are used by a Hurst exponent-based method to construct an estimate of the clean signal. The framework relies on improved minima-controlled recursive averaging (IMCRA) for adaptive noise estimation and on the optimally modified log-spectral amplitude (OM-LSA) estimator to enhance the noisy EMD components. Objective evaluations of quality and intelligibility show that the proposed method performs significantly better than the baseline techniques, including the most recently developed EMD-based speech enhancement methods.
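To make the recursive-averaging idea concrete, the following is a minimal sketch of the time-varying noise update used in minima-controlled recursive averaging, written for a single frequency bin. It is not the authors' implementation. The function names, the default smoothing value `alpha_d=0.85`, and the assumption that the speech presence probability is already available (rather than derived from minima tracking) are all illustrative choices.

```python
def recursive_noise_update(noise_psd, noisy_power, speech_prob, alpha_d=0.85):
    """One recursive-averaging step for a single frequency bin.

    noise_psd   : previous noise power estimate
    noisy_power : squared magnitude |Y|^2 of the current noisy frame
    speech_prob : speech presence probability in [0, 1] (assumed given here)
    alpha_d     : base smoothing factor (illustrative value, an assumption)
    """
    if not 0.0 <= speech_prob <= 1.0:
        raise ValueError("speech_prob must lie in [0, 1]")
    # The effective smoothing factor approaches 1 as speech becomes more
    # likely, which holds the noise estimate nearly constant during speech.
    alpha_tilde = alpha_d + (1.0 - alpha_d) * speech_prob
    return alpha_tilde * noise_psd + (1.0 - alpha_tilde) * noisy_power


def track_noise(noisy_powers, speech_probs, initial_noise, alpha_d=0.85):
    """Apply the update frame by frame and return every noise estimate."""
    estimates = []
    noise = initial_noise
    for power, prob in zip(noisy_powers, speech_probs):
        noise = recursive_noise_update(noise, power, prob, alpha_d)
        estimates.append(noise)
    return estimates
```

In a pipeline like the one described in the abstract, an update of this kind would presumably be applied to the short-time spectrum of each IMF before the spectral gain is computed. The abstract does not spell out that detail, so treat it as an assumption.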

Data availability statement

The datasets and code generated and/or analyzed during the current study are available from Asma Bouchair (first author) on reasonable request.

Funding

This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) under reference number RGPIN-2018-05221 and by the Ministry of Higher Education and Scientific Research of Algeria.

Author information

Contributions

A new empirical mode decomposition method for speech enhancement is proposed. The method can handle nonlinear distortions caused by noise and reduces noise separately in each frequency range, which makes it useful across a wide range of noise types. Objective assessments of quality and intelligibility show that the proposed method performs much better than state-of-the-art techniques, including the most recently developed EMD-based and deep learning-based speech enhancement methods. This work has not been published previously and is not under consideration for publication elsewhere.

Corresponding author

Correspondence to Sid Ahmed Selouani.

Ethics declarations

Conflicts of interest

Not applicable.

Availability of data and material

The data, namely the TIMIT corpus and the NOISEX-92 dataset, are publicly available.

Code availability

The code will be made available on GitHub and upon request.

About this article

Cite this article

Bouchair, A., Selouani, S.A., Amrouche, A. et al. Improved Empirical Mode Decomposition Using Optimal Recursive Averaging Noise Estimation for Speech Enhancement. Circuits Syst Signal Process 41, 196–223 (2022). https://doi.org/10.1007/s00034-021-01767-w

  • DOI: https://doi.org/10.1007/s00034-021-01767-w