Abstract
In modern communication systems, speech communication is used in a wide range of applications. During transmission, environmental interference commonly degrades the speech signal. Interference that affects speech quality includes acoustic noise, acoustic reverberation, and white noise. This work aims to estimate the noise in a speech signal using a Recurrent Function Network (RFN); the proposed technique is termed the Recurrent RATS Function Network (RRFN). The network estimates the different types of noise present in the noisy input speech signal. Once the noise is identified, features are estimated using a novel radial-based RATS (Robust Automatic Transcription of Speech) approach. To further improve the clarity of the speech signal, a novel generalized recursive singular value technique integrated into an elliptic filter is used to remove the noise. Simulation analysis is performed for the proposed RFN, and the results are compared with existing techniques in terms of PESQ (perceptual evaluation of speech quality) and STOI (short-time objective intelligibility). The proposed method shows improved performance over the existing techniques at different SNR levels.
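The abstract reports results at different SNR levels without saying how the noisy test signals were built. As a minimal sketch of the standard definition only, and not the author's procedure, the following scales a noise sequence so that the mixture reaches a requested SNR in dB. The function names `power`, `mix_at_snr`, and `measured_snr_db` are hypothetical, a synthetic sine stands in for speech, and white Gaussian noise is assumed.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(clean, noise, target_db):
    """Scale `noise` so that clean + noise has the requested SNR in dB.

    Returns the noisy mixture and the scaled noise.
    """
    # SNR_dB = 10*log10(P_clean / P_noise), so the noise power we need is
    # P_clean / 10^(SNR_dB / 10).
    target_noise_power = power(clean) / (10 ** (target_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled)]
    return noisy, scaled


def measured_snr_db(clean, noise):
    """SNR in dB computed from the clean signal and the additive noise."""
    return 10 * math.log10(power(clean) / power(noise))


# Assumption: a 440 Hz sine at 8 kHz stands in for a speech signal.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

for level in (0, 5, 10):
    noisy, scaled = mix_at_snr(clean, noise, level)
    print(level, round(measured_snr_db(clean, scaled), 6))
```

Mixtures built this way can then be scored against the clean reference with metrics such as PESQ and STOI, which require dedicated implementations not shown here.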
Cite this article
Srinivasarao, V. An efficient recurrent Rats function network (Rrfn) based speech enhancement through noise reduction. Multimed Tools Appl 81, 30599–30614 (2022). https://doi.org/10.1007/s11042-022-12473-3
DOI: https://doi.org/10.1007/s11042-022-12473-3