Speech Enhancement Based on Binaural Sound Source Localization and Cosh Measure Wiener Filtering

Li, Ruwei; Zhao, Fengnian; Pan, Dongmei; Dong, Liang

doi:10.1007/s00034-021-01786-7

Published: 22 July 2021

Volume 41, pages 395–424 (2022)

Circuits, Systems, and Signal Processing

Ruwei Li¹ (ORCID: orcid.org/0000-0002-7828-2242), Fengnian Zhao¹, Dongmei Pan¹ & Liang Dong²

Abstract

Existing speech enhancement algorithms perform poorly at low signal-to-noise ratios (SNRs). To address this problem, a speech enhancement algorithm based on binaural sound source localization and cosh measure Wiener filtering is proposed. First, the algorithm uses a sound source localization method based on head correlation functions and two-level deep learning to extract the spatial information of the binaural sound source and determine its spatial position. Beamforming is then used to remove noise arriving from directions other than that of the speech. Finally, a cosh-measure Wiener filter based on a logarithmic relation removes the noise arriving from the same direction as the speech, which completes the enhancement. Experiments show that the proposed algorithm is more robust and removes more noise than the comparison algorithms.
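The last two stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the full text is not available here, so the sketch substitutes a plain two-channel delay-and-sum beamformer and the classical Wiener gain, and it shows the cosh spectral distance in its textbook form (the mean of cosh(ln P - ln Q) - 1) rather than the paper's logarithmic-relation variant. The function names `delay_and_sum`, `wiener_gain`, and `cosh_measure`, and the parameters `delay_samples` and `floor`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def delay_and_sum(left, right, delay_samples):
    """Two-channel delay-and-sum beamformer (illustrative stand-in).

    Shifts the right channel by an integer number of samples so that a
    source from the steered direction lines up with the left channel,
    then averages the two. np.roll wraps around at the edges, which is
    acceptable for a sketch but not for real audio frames.
    """
    aligned = np.roll(right, -delay_samples)
    return 0.5 * (left + aligned)


def wiener_gain(noisy_power, noise_power, floor=1e-3):
    """Classical Wiener gain xi / (1 + xi) per frequency bin.

    The a priori SNR xi is approximated by power subtraction
    (noisy / noise - 1, clipped at zero). This is the textbook gain,
    not the cosh-measure gain derived in the paper.
    """
    xi = np.maximum(noisy_power / np.maximum(noise_power, 1e-12) - 1.0, 0.0)
    return np.maximum(xi / (1.0 + xi), floor)


def cosh_measure(p, q):
    """Symmetric cosh distance between two positive power spectra.

    Equals the mean of 0.5 * (p/q + q/p) - 1, i.e. the average of the
    two directed Itakura-Saito distances; it is zero only when p == q.
    """
    v = np.log(p) - np.log(q)
    return float(np.mean(np.cosh(v) - 1.0))


if __name__ == "__main__":
    # Synthetic example: the right channel lags the left by 3 samples.
    rng = np.random.default_rng(0)
    left = rng.standard_normal(256)
    right = np.roll(left, 3)
    steered = delay_and_sum(left, right, delay_samples=3)

    # Apply the Wiener gain in the frequency domain with an assumed
    # flat noise power estimate (a placeholder, not a real estimator).
    spectrum = np.fft.rfft(steered)
    noisy_power = np.abs(spectrum) ** 2
    noise_power = np.full_like(noisy_power, noisy_power.mean() * 0.1)
    enhanced = np.fft.irfft(wiener_gain(noisy_power, noise_power) * spectrum,
                            n=steered.size)
    print(enhanced.shape, cosh_measure(noisy_power + 1e-9, noise_power))
```

In this sketch the steering delay is given directly; in the pipeline the abstract describes, it would instead follow from the position estimated by the localization stage.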

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61971016).

Author information

Authors and Affiliations

Faculty of Information Technology, Beijing University of Technology, Beijing, China
Ruwei Li, Fengnian Zhao & Dongmei Pan
Electrical and Computer Engineering, Baylor University, Waco, TX, USA
Liang Dong

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Ruwei Li, Fengnian Zhao and Dongmei Pan. The first draft of the manuscript was written by Fengnian Zhao and Dongmei Pan, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Ruwei Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, R., Zhao, F., Pan, D. et al. Speech Enhancement Based on Binaural Sound Source Localization and Cosh Measure Wiener Filtering. Circuits Syst Signal Process 41, 395–424 (2022). https://doi.org/10.1007/s00034-021-01786-7

Received: 25 November 2019
Revised: 28 June 2021
Accepted: 30 June 2021
Published: 22 July 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s00034-021-01786-7
