Single-Channel Speech Enhancement Using Single Dimension Change Accelerated Particle Swarm Optimization for Subspace Partitioning

Ghorpade, Kalpana; Khaparde, Arti

doi:10.1007/s00034-023-02324-3

Single-Channel Speech Enhancement Using Single Dimension Change Accelerated Particle Swarm Optimization for Subspace Partitioning

Published: 01 March 2023

Volume 42, pages 4343–4361, (2023)
Cite this article

Circuits, Systems, and Signal Processing Aims and scope Submit manuscript

Kalpana Ghorpade¹ &
Arti Khaparde²

125 Accesses
1 Citation
1 Altmetric
Explore all metrics

Abstract

Speech signal gets contaminated by background noise affecting its quality and intelligibility. There are different sources of additive noise. This additive noise, either stationary or non-stationary, has a distinct distribution of noise energy in the frequency domain. Degraded speech affects the performance of speech-operated systems. Speech enhancement can reduce this additive noise. Here, we propose a subspace-based single-channel speech enhancement method using modified accelerated particle swarm optimization to optimize subspace partitioning. Principal components of noisy speech are partitioned into speech, speech plus noise, and noise only based on the signal-to-noise ratio of principal components. Voice activity detection is implemented to find the variance of additive noise. Modified accelerated particle swarm optimization optimizes the number of principal components in each partition and the weights of the components in each class. The proposed speech enhancement method gives better results for the quality and intelligibility measures of enhanced speech compared with conventional speech enhancement methods. We got 18.8% improvement in STOI for 0 dB restaurant noise, 20.5% improvement for 0 dB train noise, and 11.55% improvement for 0 dB exhibition noise. We got an improvement of 39.15% in PESQ for 0 dB babble noise, 41.57% for 0 dB car noise, and 31.79% increase for 0 dB airport noise. The average improvement in the segmental SNR of the enhanced speech is 8.32 dB for 0 dB noise. There is 4.4 dB improvement in SDR for the airport noise and 5.54 dB improvement for the station noise. We got this improvement with minimum speech distortion.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Comparative analysis of audio classification with MFCC and STFT features using machine learning techniques

Article Open access 03 January 2024

Speech Emotion Recognition: A Comprehensive Survey

Article 08 March 2023

Noise robust automatic speech recognition: review and analysis

Article 24 June 2023

Data Availability

The dataset analyzed during the current study is available at https://ecs.utdallas.edu/loizou/speech/noizeus/.

References

A. H. Abolhassani, S.A. Selouani, D. O’Shaughnessy, Speech enhancement using PCA and variance of the reconstruction error in distributed speech recognition. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2007
M. Anouar, B. Messaoud, A. Bouzid, N. Ellouze, Speech enhancement based on wavelet packet of an improved principal component analysis. Comput. Speech Lang. (2015). https://doi.org/10.1016/j.csl.2015.06.001
Article Google Scholar
L. Andong et al., A collaborative learning framework for single-channel speech enhancement. Appl. Acoust. 187, 108499 (2022). https://doi.org/10.1016/j.apacoust.2021.108499
Article Google Scholar
A. Aggarwal, T. Rawat, D. Upadhyay, Design of optimal digital FIR filters using evolutionary and swarm optimization techniques. AEU Int. J. Electron. Commun. 70(4), 373–385 (2016)
Article Google Scholar
S. Boll, Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans. Acoust. Speech Signal Process. 27(2), 113–120 (1979)
Article Google Scholar
A.L. Badri, M. Geravanchizadeh, Speech enhancement using sexual reproduction based PSO. 10th International Conference on Information Science, Signal Processing and their Applications, 2010
S. E. Eskimez, T. Yoshioka, H. Wang, X. Wang, Z. Chen, X. Huang, Personalized speech enhancement: new models and comprehensive evaluation. ICASSP 2022—IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 356–360
Y. Ephraim, H.L. Van Trees, A signal subspace approach for speech enhancement. IEEE Trans. Speech Audio Process. 3, 251–266 (1995). https://doi.org/10.1109/89.397090
Article Google Scholar
M. Geravanchizadeh, S.G. Osgouei, A new shuffled sub-swarm particle swarm optimization algorithm for speech enhancement. J. Adv. Comput. Eng. Technol. 1(1), 43–50 (2015)
Google Scholar
K. Ghorpade, A. Khaparde, Single Channel Speech Enhancement using evolutionary algorithm with Log-MMSE. ASEAN Eng. J. 12, 83–91 (2022). https://doi.org/10.11113/aej.v12.16770
Article Google Scholar
T. Green et al., Speech recognition with a hearing-aid processing scheme combining beamforming with mask-informed speech enhancement. Trends Hear. (2022). https://doi.org/10.1177/23312165211068629
Article Google Scholar
Z. Huang, S. Watanabe, S.W. Yang, P. García, S. Khudanpur, Investigating Self-Supervised Learning for Speech Enhancement and Separation. ICASSP 2022 -IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 6837–6841
Y. Hu, P.C. Loizou, Subjective evaluation and comparison of speech enhancement algorithms. Speech Commun. 49(7–8), 588–601 (2007)
Article Google Scholar
Y. Hu, P.C. Loizou, Evaluation of objective quality measures for speech enhancement. IEEE Trans. Audio Speech Lang. Process. 16, 229–238 (2008)
Article Google Scholar
A.M. Kondaz, Digital Speech Coding for Low Bit Rate Communication Systems (Wiley, 2004)
Book Google Scholar
D.J. Krusicnski, W.K. Jenkins, Adaptive Filtering via Particle Swarm Optimization. Proceeding 37, Asilomar Conference on Signals, Systems, and Computers, 2003
R. Kar, D. Mandal, S. Mondal, S.P. Ghoshal, Craziness based Particle Swarm Optimization algorithm for FIR band stop filter design. Swarm Evol. Comput. (2012). https://doi.org/10.1016/j.swevo.2012.05.002
Article Google Scholar
J. Kennedy, R. Eberhart, Particle swarm optimization. Proc. IEEE Int. Conf. Neural Netw. 4, 1942–1948 (1995)
Article Google Scholar
P.C. Loizou, Speech Enhancement: Theory and Practice (CRC Press, 2013)
Book Google Scholar
Y. Luo, M. Yu, Single-channel speech enhancement based on multi-band spectrogram rearranged RPCA. Electron. Lett. 55(7), 415–417 (2019)
Article Google Scholar
T. Lavanya, T. Nagarajan, P. Vijayalakshmi, Multi-level single-channel speech enhancement using a unified framework for estimating magnitude and phase spectra. IEEE/ACM Trans. Audio Speech Lang. Process. 28, 1315–1327 (2020). https://doi.org/10.1109/TASLP.2020.2986877
Article Google Scholar
M.A. Messaoud, B. Aicha, Sparse representations for single channel speech enhancement based on voiced/unvoiced classification. Circuits Syst. Signal Process. 36, 1912–1933 (2017). https://doi.org/10.1007/s00034-016-0384-6
Article Google Scholar
S. Mandal, S.P. Ghoshal, R. Kar, D. Mandal, Design of optimal linear phase FIR high pass filter using craziness-based particle swarm optimization technique. J. King Saud Univ. Comput. Inf. Sci. 24(1), 83–92 (2012)
Google Scholar
K. Paliwal, B. Schwerin. Wojcicki, Single-channel speech enhancement using spectral subtraction in the short-time modulation domain. Speech Commun. 52(5), 450–475 (2010)
Article Google Scholar
K. Prajna, G.S.B. Rao, K.V.V.S. Reddy, A new dual channel speech enhancement approach based on accelerated particle swarm optimization (APSO). Int. J. Intell. Syst. Appl. 6(4), 1–10 (2014)
Google Scholar
K. Prajna, G.S.B. Rao, K.V.V.S. Reddy, U. Maheswari, A new approach to dual channel speech enhancement based on hybrid PSOGSA. Int. J. Speech Technol. 18, 45–56 (2015)
Article Google Scholar
A.W. Rix, G.J. Beerends, M.P. Hollia, Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs. IEEE International Conference on Acoustic, Speech and Signal Processing proceedings (Cat. No.01CH37221), 2001
S. Roy, A. Nicolson, K. Paliwal, On supervised LPC estimation training targets for augmented Kalman filter-based speech enhancement. Speech Commun. 142, 49–60 (2022). https://doi.org/10.1016/j.specom.2022.06.004
Article Google Scholar
P.K. Rajani, A. Khaparde, Video error concealment using particle swarm optimization. Object detection by stereo vision images (Wiley, 2022), pp.73–98
Book Google Scholar
A. Saadoune et al., Perceptual subspace speech enhancement using variance of the reconstruction error. Digit. Signal Process. (2014). https://doi.org/10.1016/j.dsp.2013.09.005
Article Google Scholar
C. Sun, J. Xie, Y. Leng, A signal subspace speech enhancement approach based on joint low-rank and sparse matrix decomposition. Arch. Acoust. 41(2), 245–254 (2016)
Article Google Scholar
L. Shubo, et al. S-DCCRN: Super Wide Band DCCRN with Learnable Complex Feature for Speech Enhancement. ICASSP 2022—2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 7767–7771
R.S. Selvi, G.R. Suresh, Hybridization of spectral filtering with particle swarm optimization for speech signal enhancement. Int. J. Speech Technol. (2015). https://doi.org/10.1007/s10772-015-9317-1
Article Google Scholar
T.M.F. Taha, S.K. Wajid, A. Hussaain, Speech enhancement based on adaptive noise cancellation and particle swarm optimization. J. Comput. Sci. (2019). https://doi.org/10.3844/jcssp.2019.691.701
Article Google Scholar
C. Taal et al., A short-time objective intelligibility measure for time-frequency weighted noisy speech. IEEE international Conference on Acoustics, Speech and Signal Processing, 2010
R. Vetter, et al. Single channel speech enhancement using principal component analysis and MDL subspace section. Proceedings of 6^th European Conference on Speech Communication and Technology (EUROSPEECH’99), 1999
E. Vincent, R. Gribonval, C. Févotte, Performance measurement in blind audio source separation. IEEE Trans. Audio Speech Lang. Process. 14(4), 1462–1469 (2006)
Article Google Scholar
H. Yue, W. Duo, X. Peng, J. Yang, Reference-based speech enhancement via feature alignment and fusion network. Proc. AAAI Conf. Artif. Intell. 36(10), 11648–11656 (2022). https://doi.org/10.1609/aaai.v36i10.21419
Article Google Scholar
X.S. Yang, Nature-Inspired Metaheuristic Algorithms (Luniver Press, 2008)
Google Scholar
X.S. Yang, S. Deb, S. Fong, Accelerated particle swarm optimization and support vector machine for business optimization and applications networked digital technologies (NDT2011). Commun. Comput. Inf. Sci. (2011). https://doi.org/10.1007/978-3-642-22185-9_62011
Article Google Scholar
L. Zadeh, Frequency analysis of variable networks. Proc. IRE (1950). https://doi.org/10.1109/JRPROC.1950.231083
Article Google Scholar
C. Zheng, X. Peng, Y. Zhang, S. Srinivasan, Y. Lu, Interactive Speech and Noise Modeling for Speech Enhancement. Proc. AAAI Conf. Artif. Intell. 35(16), 14549–14557 (2021). https://doi.org/10.1609/aaai.v35i16.17710
Article Google Scholar

Download references

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Authors and Affiliations

Department of Electronics and Telecommunication, Cummins College of Engineering for Women, Pune, Maharashtra, India
Kalpana Ghorpade
Department of ECE, MIT World Peace University, Pune, Maharashtra, India
Arti Khaparde

Authors

Kalpana Ghorpade
View author publications
You can also search for this author in PubMed Google Scholar
Arti Khaparde
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Kalpana Ghorpade.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Consent to Participate

All authors have approved the manuscript and agreed with the submission.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Ghorpade, K., Khaparde, A. Single-Channel Speech Enhancement Using Single Dimension Change Accelerated Particle Swarm Optimization for Subspace Partitioning. Circuits Syst Signal Process 42, 4343–4361 (2023). https://doi.org/10.1007/s00034-023-02324-3

Download citation

Received: 08 July 2022
Revised: 08 February 2023
Accepted: 09 February 2023
Published: 01 March 2023
Issue Date: July 2023
DOI: https://doi.org/10.1007/s00034-023-02324-3

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Single-Channel Speech Enhancement Using Single Dimension Change Accelerated Particle Swarm Optimization for Subspace Partitioning

Abstract

Access this article

Similar content being viewed by others

Comparative analysis of audio classification with MFCC and STFT features using machine learning techniques

Speech Emotion Recognition: A Comprehensive Survey

Noise robust automatic speech recognition: review and analysis

Data Availability

References

Funding

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Consent to Participate

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Single-Channel Speech Enhancement Using Single Dimension Change Accelerated Particle Swarm Optimization for Subspace Partitioning

Abstract

Access this article

Similar content being viewed by others

Comparative analysis of audio classification with MFCC and STFT features using machine learning techniques

Speech Emotion Recognition: A Comprehensive Survey

Noise robust automatic speech recognition: review and analysis

Data Availability

References

Funding

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Consent to Participate

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation