Speech dereverberation and source separation using DNN-WPE and LWPR-PCA

Sheeja, Jasmine J. C.; Sankaragomathi, B.

doi:10.1007/s00521-022-07884-0

Original Article
Published: 08 January 2023

Volume 35, pages 7339–7356, (2023)

Neural Computing and Applications

Abstract

Speech signals captured by distant microphones are often corrupted by acoustic interference such as noise and reverberation, which degrades speech quality. The captured signals must therefore be processed to separate the sources blindly and remove the reverberation. We propose a novel speech separation and dereverberation method that combines Locally Weighted Projection Regression (LWPR)-based Principal Component Analysis (PCA) with Deep Neural Network (DNN)-based Weighted Prediction Error (WPE). The method preprocesses the mixed reverberant signal before applying Blind Source Separation (BSS) and Blind Dereverberation (BD). Preprocessing applies the fast Fourier transform (FFT) and whitening to convert the time-domain signal into the frequency domain and to generate the transformation matrices. LWPR-PCA then performs the BSS, and DNN-WPE performs the BD. We compare the proposed method experimentally with the existing RPCA-SNMF, CBF, BA-CNMF, AFMNMF, and ISC-LPKF approaches. The results indicate that the proposed method effectively separates the original signal from the mixed reverberant signals.
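The preprocessing stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two-channel synthetic mixture, the mixing matrix, the helper name `whiten`, and the choice to whiten the complex spectra with a single eigendecomposition are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np

def whiten(x):
    """PCA-whiten a (channels, bins) array.

    Returns the whitened data and the whitening (transformation) matrix.
    Hypothetical helper: the paper's exact whitening procedure is not given.
    """
    x_centered = x - x.mean(axis=1, keepdims=True)
    # Channel covariance matrix (Hermitian for complex input).
    cov = x_centered @ x_centered.conj().T / x_centered.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    eps = 1e-12  # guards against division by zero for tiny eigenvalues
    w = np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.conj().T
    return w @ x_centered, w

# Synthetic stand-in for a two-microphone mixture (assumed data).
rng = np.random.default_rng(0)
sources = rng.laplace(size=(2, 16000))
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])
mixed = mixing @ sources

# Step 1: FFT converts each time-domain channel to the frequency domain.
spectrum = np.fft.rfft(mixed, axis=1)

# Step 2: whitening decorrelates the channels and yields the transformation matrix.
whitened, w = whiten(spectrum)
cov_after = whitened @ whitened.conj().T / whitened.shape[1]
```

After whitening, `cov_after` is close to the identity matrix, meaning the channels are decorrelated with unit variance. That is the usual starting point for a PCA-style separation stage.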

Availability of data and material

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

Author information

Authors and Affiliations

Department of ECE, Rohini College of Engineering and Technology, Palkulam, Kanyakumari, India
Jasmine J. C. Sheeja
Department of Biomedical Engineering, Sri Sakthi Institute of Engineering and Technology, Coimbatore, India
B. Sankaragomathi

Corresponding author

Correspondence to Jasmine J. C. Sheeja.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal rights

This article does not contain any studies with human or animal subjects performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sheeja, J.J.C., Sankaragomathi, B. Speech dereverberation and source separation using DNN-WPE and LWPR-PCA. Neural Comput & Applic 35, 7339–7356 (2023). https://doi.org/10.1007/s00521-022-07884-0

Received: 07 April 2021
Accepted: 22 September 2022
Published: 08 January 2023
Issue Date: April 2023
DOI: https://doi.org/10.1007/s00521-022-07884-0
