DCU-Net transient noise suppression based on joint spectrum estimation

  • Original Paper
  • Signal, Image and Video Processing

Abstract

Transient noise has high short-time energy, a high degree of randomness, and a wide frequency-domain distribution, and it pollutes the signal only locally. Traditional denoising methods usually assume a particular relationship between speech and noise, and this assumption does not necessarily hold in real-life scenarios. As a result, traditional denoising methods do not suppress transient noise effectively. This paper therefore proposes a new denoising scheme. First, based on the conventional optimally-modified log-spectral amplitude (OM-LSA) estimation algorithm, the minima controlled recursive averaging algorithm is replaced by the improved mean recurrence time algorithm, and the transient noise spectrum is estimated. Second, transient noise segments are determined using thresholds and fed into a deep complex-valued U-Net (DCU-Net) for speech enhancement. Third, the enhanced results are inserted back into the original sequence to reconstruct the denoised speech signal. Finally, experiments are carried out using speech from the Voice Bank corpus and self-made noise datasets. The results show that, in 0 dB, −5 dB, and −10 dB environments, the proposed method improves the segmental signal-to-noise ratio, the perceptual evaluation of speech quality (PESQ), and the short-time objective intelligibility (STOI) relative to the traditional OM-LSA algorithm. At a signal-to-noise ratio of −10 dB, the segmental signal-to-noise ratio is improved by 9.8%. These results indicate that the proposed method can effectively suppress transient noise at low signal-to-noise ratios while improving speech quality.
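The segment-wise scheme summarized above (detect transient segments by thresholding, enhance only those segments, reinsert them into the original sequence) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the median-energy threshold is an assumed stand-in for the OM-LSA-based transient noise spectrum estimate, the `enhance` callable is a hypothetical placeholder for a trained DCU-Net, and the function names and the parameters `frame_len`, `hop`, and `ratio` are illustrative choices.

```python
import numpy as np

def frame_energies(x, frame_len, hop):
    """Short-time energy of each frame (rectangular window, for simplicity)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.array(
        [np.sum(x[i * hop : i * hop + frame_len] ** 2) for i in range(n_frames)]
    )

def detect_transient_segments(x, frame_len=256, hop=128, ratio=5.0):
    """Return (start, end) sample ranges flagged as transient.

    Assumption: a frame is flagged when its energy exceeds `ratio` times the
    median frame energy. The paper instead thresholds an estimated transient
    noise spectrum; this rule only mimics the idea.
    """
    energy = frame_energies(x, frame_len, hop)
    flagged = energy > ratio * np.median(energy)
    segments, start = [], None
    for i, is_transient in enumerate(flagged):
        if is_transient and start is None:
            start = i * hop  # first sample of a new run of flagged frames
        elif not is_transient and start is not None:
            # the run ended at frame i - 1; close the segment at its last sample
            segments.append((start, (i - 1) * hop + frame_len))
            start = None
    if start is not None:  # run extends to the final frame
        segments.append((start, (len(flagged) - 1) * hop + frame_len))
    return segments

def suppress_transients(x, enhance, **detector_kwargs):
    """Enhance only the flagged segments and reinsert them into the signal.

    `enhance` is a hypothetical callable mapping a 1-D segment to an enhanced
    segment of the same length (in the paper, this role is played by DCU-Net).
    """
    y = x.copy()
    for start, end in detect_transient_segments(x, **detector_kwargs):
        y[start:end] = enhance(x[start:end])
    return y
```

Samples outside the flagged segments are passed through unchanged, which reflects the observation that transient noise pollutes the signal only locally. In practice `enhance` would wrap a model that operates on the complex spectrogram of the segment; any callable with the same input and output length can be substituted for experimentation.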

Availability of data and materials

All the data included in this study are available upon request by contacting the corresponding author.

Not applicable.

Funding

This research was funded by the Natural Science Foundation of Heilongjiang Province (No. LH2020F033), the National Natural Science Youth Foundation of China (No. 11804068), and the Research Project of the Heilongjiang Province Health Commission (20221111001069).

Author information

Contributions

CL contributed to the conception of the study and contributed significantly to the analysis and manuscript preparation; LZ and SZ made important contributions to restructuring the paper, English editing, and revising the manuscript; ZS performed the experiments and the data analyses and wrote the original manuscript; HC made significant contributions to the structure, English editing, and data analysis of the paper; MZ carried out the data analysis and the major revision of the paper; and XG and CH made important contributions to proofreading the English.

Corresponding authors

Correspondence to Lei Zhang, Chuang Han or Meng Zhang.

Ethics declarations

Conflicts of interest

The authors declare that there is no conflict of interest regarding the publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lan, C., Zhao, S., Zhang, L. et al. DCU-Net transient noise suppression based on joint spectrum estimation. SIViP 17, 3265–3273 (2023). https://doi.org/10.1007/s11760-023-02541-y

  • DOI: https://doi.org/10.1007/s11760-023-02541-y
