Efficient Speech Enhancement Using Recurrent Convolution Encoder and Decoder

Karthik, A.; MazherIqbal, J. L.

doi:10.1007/s11277-021-08313-6

Published: 06 March 2021

Volume 119, pages 1959–1973, (2021)

Wireless Personal Communications

Abstract

The accuracy of speech recognition degrades in the presence of background noise. Automatic Speech Recognition and communication systems therefore rely on speech enhancement to reduce or eliminate the surrounding noise, and a variety of techniques exist for enhancing corrupted speech signals. In this paper, a Recurrent Convolutional Encoder-Decoder (R-CED) network is proposed for enhancing speech by removing the noise signals. The proposed method is evaluated by comparing its PESQ, STOI, and CER scores with those of various existing techniques. The results indicate that the proposed R-CED performs better than the existing techniques on these metrics.
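As an illustration of one of the metrics named above, the following is a minimal sketch of how CER could be computed. It assumes that CER stands for character error rate (the abstract does not expand the acronym) and uses the common definition of Levenshtein edit distance divided by reference length. The function names `edit_distance` and `character_error_rate` are hypothetical and are not taken from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Assumed definition: edit distance normalised by reference length."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Under this definition a lower CER is better, so an enhancement front end would be judged by how much it lowers the CER of a downstream recogniser on noisy input. PESQ and STOI, by contrast, are computed on the waveforms themselves and need dedicated implementations.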

Author information

Authors and Affiliations

Electronics and Communication Engineering, Veltech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Chennai, India
A. Karthik & J. L. MazherIqbal

Corresponding author

Correspondence to A. Karthik.

About this article

Cite this article

Karthik, A., MazherIqbal, J.L. Efficient Speech Enhancement Using Recurrent Convolution Encoder and Decoder. Wireless Pers Commun 119, 1959–1973 (2021). https://doi.org/10.1007/s11277-021-08313-6

Accepted: 18 February 2021
Published: 06 March 2021
Issue Date: August 2021
DOI: https://doi.org/10.1007/s11277-021-08313-6
