Speech Enhancement Based on Binaural Sound Source Localization and Cosh Measure Wiener Filtering

Circuits, Systems, and Signal Processing

Abstract

Existing speech enhancement algorithms perform poorly at low signal-to-noise ratios (SNRs). To address this problem, a speech enhancement algorithm based on binaural sound source localization and cosh measure Wiener filtering is proposed. First, a sound source localization algorithm based on head correlation functions and two-level deep learning extracts the spatial information of the binaural sound source and determines its spatial position. Beamforming is then used to remove noise arriving from directions other than that of the speech. Finally, a cosh measure Wiener filter based on a logarithmic relation removes the noise arriving from the same direction as the speech, completing the enhancement. Experiments show that the proposed algorithm is more robust and has better denoising ability than the comparison algorithms.
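
The final filtering stage can be illustrated with a minimal sketch. It is not the authors' filter, whose exact form is not given in this preview. It only shows two standard building blocks under stated assumptions: the classical Wiener gain xi / (1 + xi), and the cosh spectral distance of Gray and Markel, taken here as the mean of cosh(ln P - ln P_est) - 1 over frequency bins. The function names, the crude SNR estimate, and the example values are all hypothetical.

```python
import math


def wiener_gain(snr_prior):
    """Classical Wiener gain xi / (1 + xi) for an a priori SNR xi (linear scale)."""
    return snr_prior / (1.0 + snr_prior)


def cosh_measure(p_ref, p_est):
    """Cosh spectral distance between two power spectra (assumed form).

    Averages cosh(ln p_ref - ln p_est) - 1 over bins. The value is 0 when
    the spectra match and is symmetric in its two arguments.
    """
    total = 0.0
    for a, b in zip(p_ref, p_est):
        v = math.log(a) - math.log(b)  # log-spectral difference in this bin
        total += math.cosh(v) - 1.0
    return total / len(p_ref)


def enhance_bins(noisy_power, noise_power):
    """Apply a per-bin Wiener gain to noisy power spectrum values.

    The a priori SNR is approximated by max(y / n - 1, 0), a simple
    estimate chosen for illustration only.
    """
    enhanced = []
    for y, n in zip(noisy_power, noise_power):
        xi = max(y / n - 1.0, 0.0)
        g = wiener_gain(xi)
        enhanced.append(g * g * y)  # gain acts on amplitude, so power scales by g^2
    return enhanced
```

For example, `enhance_bins([4.0], [1.0])` uses an estimated SNR of 3 and a gain of 0.75, while `cosh_measure` could serve as a distortion score between a clean and an enhanced spectrum.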

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61971016).

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Ruwei Li, Fengnian Zhao, and Dongmei Pan. The first draft of the manuscript was written by Fengnian Zhao and Dongmei Pan, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Ruwei Li.

About this article

Cite this article

Li, R., Zhao, F., Pan, D. et al. Speech Enhancement Based on Binaural Sound Source Localization and Cosh Measure Wiener Filtering. Circuits Syst Signal Process 41, 395–424 (2022). https://doi.org/10.1007/s00034-021-01786-7

  • DOI: https://doi.org/10.1007/s00034-021-01786-7
