Single-Channel Speech Enhancement Using Single Dimension Change Accelerated Particle Swarm Optimization for Subspace Partitioning

Circuits, Systems, and Signal Processing

Abstract

Background noise contaminates speech signals and degrades their quality and intelligibility. Additive noise arises from many sources and, whether stationary or non-stationary, has a distinct distribution of energy in the frequency domain. Degraded speech impairs the performance of speech-operated systems, and speech enhancement can reduce the additive noise. We propose a subspace-based single-channel speech enhancement method that uses modified accelerated particle swarm optimization to optimize subspace partitioning. The principal components of noisy speech are partitioned into three classes (speech, speech plus noise, and noise only) according to the signal-to-noise ratio of each component. Voice activity detection is used to estimate the variance of the additive noise. The modified accelerated particle swarm optimization optimizes the number of principal components in each partition and the weights of the components in each class. The proposed method yields better quality and intelligibility scores for the enhanced speech than conventional speech enhancement methods. At 0 dB, STOI improves by 18.8% for restaurant noise, 20.5% for train noise, and 11.55% for exhibition noise, and PESQ improves by 39.15% for babble noise, 41.57% for car noise, and 31.79% for airport noise. The average improvement in segmental SNR of the enhanced speech is 8.32 dB for 0 dB noise. SDR improves by 4.4 dB for airport noise and 5.54 dB for station noise. These improvements are obtained with minimal speech distortion.
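The three-way partitioning of principal components described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes white additive noise, so that each eigenvalue of the noisy covariance is roughly the clean eigenvalue plus the noise variance. The function names, the SNR thresholds `hi_db` and `lo_db`, and the per-class `weights` are hypothetical placeholders; in the proposed method the partition sizes and weights are chosen by the modified accelerated particle swarm optimization, and the noise variance comes from voice activity detection.

```python
import numpy as np


def partition_components(noisy_eigvals, noise_var, hi_db=10.0, lo_db=0.0):
    """Label each principal component by its estimated SNR.

    Returns an integer array: 2 = speech, 1 = speech plus noise, 0 = noise only.
    hi_db and lo_db are hypothetical thresholds, not values from the paper.
    """
    # White-noise assumption: noisy eigenvalue ~ clean eigenvalue + noise variance.
    clean = np.maximum(noisy_eigvals - noise_var, 0.0)
    snr_db = 10.0 * np.log10(np.maximum(clean, 1e-12) / noise_var)
    labels = np.zeros(len(noisy_eigvals), dtype=int)
    labels[snr_db >= lo_db] = 1
    labels[snr_db >= hi_db] = 2
    return labels


def enhance_frame(vectors, noise_var, weights=(0.0, 0.5, 1.0)):
    """Project noisy vectors onto principal components, weight, and reconstruct.

    vectors: array of shape (n_vectors, dim), one noisy speech vector per row.
    weights: hypothetical gains for (noise only, speech plus noise, speech).
    """
    # Sample covariance without centering (speech is treated as zero-mean).
    cov = vectors.T @ vectors / vectors.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    labels = partition_components(eigvals, noise_var)
    gains = np.asarray(weights, dtype=float)[labels]
    projected = vectors @ eigvecs
    return (projected * gains) @ eigvecs.T
```

With unit weights for all three classes the reconstruction returns the input unchanged, which is a convenient sanity check before any weights are tuned.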

Data Availability

The dataset analyzed during the current study is available at https://ecs.utdallas.edu/loizou/speech/noizeus/.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Corresponding author

Correspondence to Kalpana Ghorpade.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Consent to Participate

All authors have approved the manuscript and agreed with the submission.

About this article

Cite this article

Ghorpade, K., Khaparde, A. Single-Channel Speech Enhancement Using Single Dimension Change Accelerated Particle Swarm Optimization for Subspace Partitioning. Circuits Syst Signal Process 42, 4343–4361 (2023). https://doi.org/10.1007/s00034-023-02324-3

  • DOI: https://doi.org/10.1007/s00034-023-02324-3
