Abstract
Background noise contaminates speech signals and degrades their quality and intelligibility. Additive noise arises from many sources and, whether stationary or non-stationary, each source has a distinct distribution of noise energy in the frequency domain. Degraded speech impairs the performance of speech-operated systems, and speech enhancement can reduce this additive noise. We propose a subspace-based single-channel speech enhancement method that uses modified accelerated particle swarm optimization to optimize subspace partitioning. The principal components of noisy speech are partitioned into three classes (speech, speech plus noise, and noise only) according to the signal-to-noise ratio of each principal component. Voice activity detection is used to estimate the variance of the additive noise. Modified accelerated particle swarm optimization optimizes the number of principal components in each partition and the weights of the components in each class. The proposed method yields better quality and intelligibility scores for the enhanced speech than conventional speech enhancement methods. Short-time objective intelligibility (STOI) improves by 18.8% for 0 dB restaurant noise, 20.5% for 0 dB train noise, and 11.55% for 0 dB exhibition noise. Perceptual evaluation of speech quality (PESQ) improves by 39.15% for 0 dB babble noise, 41.57% for 0 dB car noise, and 31.79% for 0 dB airport noise. The average improvement in the segmental SNR of the enhanced speech is 8.32 dB for 0 dB noise. The signal-to-distortion ratio (SDR) improves by 4.4 dB for airport noise and 5.54 dB for station noise. These improvements are obtained with minimum speech distortion.
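The partitioning step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed thresholds `hi_db` and `lo_db`, and the Wiener-style weight for the middle class are all assumptions made for illustration. In the proposed method, the partition sizes and component weights are chosen by the modified accelerated particle swarm optimization, and the noise variance comes from voice activity detection; here both are simply passed in as arguments.

```python
import numpy as np

def partition_components(eigvals, noise_var, hi_db=10.0, lo_db=0.0):
    """Label each principal component by its estimated SNR.

    Labels: 2 = speech, 1 = speech plus noise, 0 = noise only.
    hi_db and lo_db are hypothetical fixed thresholds; an optimizer
    would normally choose the partition boundaries.
    """
    # Estimated clean-signal energy per component, floored to stay positive.
    signal = np.maximum(eigvals - noise_var, 1e-12)
    snr_db = 10.0 * np.log10(signal / noise_var)
    labels = np.zeros(len(eigvals), dtype=int)
    labels[snr_db >= lo_db] = 1
    labels[snr_db >= hi_db] = 2
    return labels, snr_db

def enhance_frames(frames, noise_var, hi_db=10.0, lo_db=0.0):
    """Project frames onto principal components, weight them, and reconstruct.

    frames: array of shape (num_frames, frame_length).
    noise_var: additive-noise variance (assumed to come from a VAD stage).
    """
    mean = frames.mean(axis=0)
    centered = frames - mean
    cov = centered.T @ centered / len(frames)
    eigvals, eigvecs = np.linalg.eigh(cov)  # symmetric matrix, ascending eigenvalues
    labels, _ = partition_components(eigvals, noise_var, hi_db, lo_db)
    # Assumed weighting: keep speech, attenuate the mixed class, drop noise.
    signal = np.maximum(eigvals - noise_var, 0.0)
    wiener = signal / (signal + noise_var)
    weights = np.where(labels == 2, 1.0, np.where(labels == 1, wiener, 0.0))
    coeffs = centered @ eigvecs
    return (coeffs * weights) @ eigvecs.T + mean
```

With a very small `noise_var`, every component falls in the speech class and the frames are reconstructed unchanged; with a very large `noise_var`, every component is treated as noise and only the frame mean survives. An optimizer would search between these extremes for the boundaries and weights that maximize a quality or intelligibility score.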
Data Availability
The dataset analyzed during the current study is available at https://ecs.utdallas.edu/loizou/speech/noizeus/.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Consent to Participate
All authors have approved the manuscript and agreed with the submission.
About this article
Cite this article
Ghorpade, K., Khaparde, A. Single-Channel Speech Enhancement Using Single Dimension Change Accelerated Particle Swarm Optimization for Subspace Partitioning. Circuits Syst Signal Process 42, 4343–4361 (2023). https://doi.org/10.1007/s00034-023-02324-3
DOI: https://doi.org/10.1007/s00034-023-02324-3