Improved Empirical Mode Decomposition Using Optimal Recursive Averaging Noise Estimation for Speech Enhancement

Bouchair, Asma; Selouani, Sid Ahmed; Amrouche, Abderrahmane; Sidi Yakoub, Mohammed

doi:10.1007/s00034-021-01767-w

Published: 21 June 2021

Circuits, Systems, and Signal Processing, Volume 41, pages 196–223 (2022)

Asma Bouchair¹,²,
Sid Ahmed Selouani¹ (ORCID: orcid.org/0000-0003-0731-2632),
Abderrahmane Amrouche² &
Mohammed Sidi Yakoub¹

Abstract

This paper presents a new approach to robust speech enhancement based on improved ensemble empirical mode decomposition (EMD) with optimized log-spectral amplitude noise estimation. The noisy signal is adaptively decomposed into a sum of oscillating components, the intrinsic mode functions (IMFs). Each component is then enhanced separately, and the resulting less-corrupted IMFs are used by the Hurst exponent method to construct an estimate of the clean signal. The framework relies on adaptive noise estimation by improved minima-controlled recursive averaging (IMCRA) and on the optimally modified log-spectral amplitude (OM-LSA) estimator to enhance the noisy EMD components. Objective evaluations of quality and intelligibility show that the proposed method performs significantly better than the baseline techniques, including the most recently developed EMD-based speech enhancement methods.
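
The recursive-averaging idea behind the noise estimator can be illustrated with a short sketch. The Python below is a simplified, MCRA-style noise tracker and is not the authors' IMCRA implementation: it smooths the noisy power per frequency bin, compares the smoothed power with its recent minimum to decide whether speech is likely present, and slows the noise update when it is. The function name `track_noise_power`, the parameter values, and the fixed-length minimum search window are illustrative assumptions. According to the abstract, noise estimation of this kind is applied to the noisy EMD components rather than to the full-band signal only.

```python
def track_noise_power(power_frames, alpha_d=0.95, alpha_s=0.8,
                      alpha_p=0.2, delta=5.0, window=8):
    """Simplified MCRA-style noise tracking (illustrative sketch).

    power_frames: list of frames, each a list of noisy power values |Y(k, l)|^2.
    Returns one noise power estimate per bin for every frame.
    All parameter values are assumptions chosen for illustration.
    """
    n_bins = len(power_frames[0])
    noise = list(power_frames[0])     # noise estimate, initialised from frame 0
    smooth = list(power_frames[0])    # recursively smoothed noisy power S(k, l)
    presence = [0.0] * n_bins         # speech presence probability p(k, l)
    history = [list(smooth)]          # recent S values for minimum tracking
    estimates = [list(noise)]

    for frame in power_frames[1:]:
        # First-order recursive smoothing of the noisy power over time.
        for k, y2 in enumerate(frame):
            smooth[k] = alpha_s * smooth[k] + (1.0 - alpha_s) * y2
        history.append(list(smooth))
        history = history[-window:]

        for k, y2 in enumerate(frame):
            # Minima-controlled decision: speech is assumed present when the
            # smoothed power is well above its minimum in the search window.
            s_min = min(h[k] for h in history)
            indicator = 1.0 if smooth[k] > delta * s_min else 0.0
            presence[k] = alpha_p * presence[k] + (1.0 - alpha_p) * indicator

            # Time-varying smoothing factor: close to 1 when speech is likely,
            # so the noise estimate is barely updated in that case.
            alpha_tilde = alpha_d + (1.0 - alpha_d) * presence[k]
            noise[k] = alpha_tilde * noise[k] + (1.0 - alpha_tilde) * y2
        estimates.append(list(noise))
    return estimates


if __name__ == "__main__":
    # Stationary noise in both bins, then one loud frame in bin 0.
    frames = [[1.0, 1.0]] * 5 + [[100.0, 1.0]]
    for row in track_noise_power(frames):
        print(row)
```

With stationary input the estimate stays at the input level. When a loud frame arrives, the presence probability rises and the noise estimate in that bin moves only slightly, instead of following the burst as a fixed smoothing factor would.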

Data availability statement

The datasets and code generated and/or analyzed during the current study are available from Asma Bouchair (first author) on reasonable request.

Funding

This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) under reference number RGPIN-2018-05221 and by the Ministry of Higher Education and Scientific Research of Algeria.

Author information

Authors and Affiliations

Université de Moncton, Shippagan campus, 128 Boul. J-D. Gauthier, Shippagan, NB, E8S 1P6, Canada
Asma Bouchair, Sid Ahmed Selouani & Mohammed Sidi Yakoub
University of Science and Technology Houari Boumediene, B.P. 32 EL Alia, 16111, Algiers, Algeria
Asma Bouchair & Abderrahmane Amrouche

Contributions

A new empirical mode decomposition method for speech enhancement is proposed. The method handles nonlinear distortions caused by noise and reduces noise separately in each frequency range, which makes it useful for a wide range of noise types. Objective assessments of quality and intelligibility show that the proposed method performs much better than state-of-the-art techniques, including the most recently developed EMD- and deep learning-based speech enhancement methods. This work has not been published previously, and it is not under consideration for publication elsewhere.

Corresponding author

Correspondence to Sid Ahmed Selouani.

Ethics declarations

Conflicts of interest

Not applicable.

Availability of data and material

The data, namely the TIMIT corpus and the NOISEX-92 dataset, are publicly available.

Code availability

The code will be made available on GitHub and upon request.

About this article

Cite this article

Bouchair, A., Selouani, S.A., Amrouche, A. et al. Improved Empirical Mode Decomposition Using Optimal Recursive Averaging Noise Estimation for Speech Enhancement. Circuits Syst Signal Process 41, 196–223 (2022). https://doi.org/10.1007/s00034-021-01767-w

Received: 30 August 2020
Revised: 08 June 2021
Accepted: 09 June 2021
Published: 21 June 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s00034-021-01767-w
