Bidirectional microphone array with adaptation controlled by voice activity detector based on multiple beamformers

Šarić, Zoran; Subotić, Miško; Bilibajkić, Ružica; Barjaktarović, Marko

doi:10.1007/s11042-018-6895-3

Published: 28 November 2018

Volume 78, pages 15235–15254 (2019)

Multimedia Tools and Applications

Zoran Šarić¹ (ORCID: orcid.org/0000-0001-9964-9974), Miško Subotić¹, Ružica Bilibajkić¹ & Marko Barjaktarović²

Abstract

Ambient noise suppression in a reverberant room is usually performed by a microphone array. Adaptive beamforming, whose typical representative is the minimum variance distortionless response (MVDR) beamformer, is an effective method for noise suppression. However, the MVDR beamformer gives poor results in a real room because of its sensitivity to steering error and multipath wave propagation. In this paper we propose a noise suppression method based on the assumption that the positions of the speakers in the reverberant room are roughly known. Noise reduction is realized by two MVDR beamformers, one directed toward each speaker. Adaptation of the MVDR beamformers is controlled by a speaker activity detector whose decision is based on a power transfer model of multiple superdirective beamformers in a combined diffuse and coherent noise field. The proposed voice activity detector also provides residual noise reduction. The proposed method and its robustness to steering error were tested on a simulated room model as well as in a real room environment. The improvement of the restored speech signal was evaluated by Signal to Noise Ratio Enhancement (SNRE) and by the Perceptual Evaluation of Speech Quality (PESQ) measure.
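For readers unfamiliar with the MVDR beamformer named in the abstract, the following is a minimal sketch of the textbook weight formula w = R⁻¹d / (dᴴR⁻¹d) at a single frequency. It is not the authors' implementation: the four-microphone linear geometry, the 1 kHz test frequency, the source angles, the diagonal loading term and the function names `steering_vector` and `mvdr_weights` are all illustrative assumptions.

```python
import numpy as np


def steering_vector(mic_positions, angle_rad, freq_hz, c=343.0):
    """Far-field steering vector for a linear array laid along the x-axis.

    Assumed geometry: a plane wave arriving at `angle_rad` from the array axis.
    """
    delays = mic_positions * np.cos(angle_rad) / c
    return np.exp(-2j * np.pi * freq_hz * delays)


def mvdr_weights(R, d, loading=1e-3):
    """Textbook MVDR weights w = R^{-1} d / (d^H R^{-1} d).

    A small diagonal loading term (an assumption, not from the paper) keeps
    the covariance matrix well conditioned.
    """
    n = R.shape[0]
    R_loaded = R + loading * np.real(np.trace(R)) / n * np.eye(n)
    Rinv_d = np.linalg.solve(R_loaded, d)
    return Rinv_d / (d.conj() @ Rinv_d)


# Hypothetical setup: 4 microphones spaced 5 cm apart, evaluated at 1 kHz.
mics = np.array([0.0, 0.05, 0.10, 0.15])
freq = 1000.0
d_speaker = steering_vector(mics, np.deg2rad(60.0), freq)
d_noise = steering_vector(mics, np.deg2rad(120.0), freq)

# Noise covariance: one coherent interferer plus weak uncorrelated sensor noise.
R_noise = 10.0 * np.outer(d_noise, d_noise.conj()) + 0.1 * np.eye(len(mics))

w = mvdr_weights(R_noise, d_speaker)
gain_speaker = abs(w.conj() @ d_speaker)  # distortionless constraint: unity gain
gain_noise = abs(w.conj() @ d_noise)      # interferer direction is attenuated
print(f"gain toward speaker: {gain_speaker:.3f}, toward interferer: {gain_noise:.4f}")
```

The weights are computed from a noise-only covariance matrix here. In a running system that matrix has to be estimated from the microphone signals, which is why the decision of when to adapt matters.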

Notes

Strictly speaking, it is not a beamformer because it uses only one microphone, i.e. the fourth microphone, with an omnidirectional characteristic.
In the experimental tests we used a small value of λ, λ = 0.25, which provides fast tracking of power changes.
In practice, there is one more hypothesis: both speakers speak simultaneously. In this case we assume that the louder speaker is active.
In this test case, SNRE is the ratio of the speech energy during the speech segment to the residual noise in the pause segment attenuated by (19).
PESQ in this test case relates to the whole signal displayed in Fig. 8e (the signal with additional noise attenuation by (19)).
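The remark about λ = 0.25 can be illustrated with a short sketch. The exact form of the power estimator is not shown in this excerpt, so the first-order recursion P[n] = λ·P[n−1] + (1−λ)·x[n]² used below is an assumption, chosen because a small λ then gives fast tracking. The function names are hypothetical, and so is the rule that "louder" means the larger smoothed power.

```python
def smooth_power(samples, lam=0.25):
    """Recursive power estimate P[n] = lam * P[n-1] + (1 - lam) * x[n]**2.

    Assumed form: a small `lam` forgets old values quickly (fast tracking).
    """
    p = 0.0
    track = []
    for x in samples:
        p = lam * p + (1.0 - lam) * x * x
        track.append(p)
    return track


def louder_speaker(power_a, power_b):
    """Pick the speaker with the larger smoothed power (hypothetical tie rule: A)."""
    return "A" if power_a >= power_b else "B"


# Step from silence to a unit-amplitude signal: the estimate settles in a few samples.
signal = [0.0] * 5 + [1.0] * 5
track = smooth_power(signal)
print([round(p, 4) for p in track[4:8]])
print(louder_speaker(0.2, 0.5))
```

With λ = 0.25 the estimate reaches 75% of the new power level after one sample and over 98% after three. A value of λ close to 1 would respond far more slowly.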

Acknowledgements

This research was supported by grants 178027, TR32032 and TR32035 from the Ministry of Education, Science and Technological Development of the Republic of Serbia.

Author information

Authors and Affiliations

Laboratory of Acoustics, Life Activities Advancement Center, Gospodar Jovanova 35, Belgrade, 11000, Serbia
Zoran Šarić, Miško Subotić & Ružica Bilibajkić
Faculty of Electrical Engineering, University of Belgrade, Belgrade, Serbia
Marko Barjaktarović

Corresponding author

Correspondence to Zoran Šarić.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Transfer of the acoustic power by diffuse noise field

The transfer of the diffuse component of the acoustic power from the acoustic source to the output of the beamformer is defined by a linear transfer factor:

$$ {\beta}_k=\frac{P_{diff,k}}{P_s} $$

(22)

where P_{diff,k} is the total diffuse power at the output of beamformer k and P_s is the power of the acoustic source measured at a distance of 1 m. Taking into account the directivity of the microphone array, defined by the beam pattern h_k(j, ϕ, θ), the diffuse power component is

$$ {P}_{diff,k}=\frac{P_{dif\_ array}}{D_k(j)} $$

(23)

where D_k(j) is the directivity factor and P_{dif_array} is the diffuse power component at the microphone array position. Diffuse power is uniformly distributed in the room and is equal to the direct-path power at the critical distance d_c:

$$ {P}_{dif\_ array}={P}_{direct}={P}_s{\left(1/{d}_c\right)}^2 $$

(24)

Substituting (23) and (24) into (22), we obtain

$$ {\beta}_k=\frac{1}{d_c^2{D}_k(j)} $$

(25)
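As a numerical illustration of (25), the sketch below evaluates β_k = 1 / (d_c² · D_k) for two cases. The critical distance of 1 m and the directivity factors of 1 and 4 are illustrative assumptions rather than measurements from the paper, and the function names are hypothetical.

```python
import math


def diffuse_transfer_factor(critical_distance_m, directivity_factor):
    """Evaluate (25): beta_k = 1 / (d_c**2 * D_k)."""
    if critical_distance_m <= 0 or directivity_factor <= 0:
        raise ValueError("critical distance and directivity factor must be positive")
    return 1.0 / (critical_distance_m ** 2 * directivity_factor)


def diffuse_output_power(source_power, critical_distance_m, directivity_factor):
    """Rearranged (22): P_diff,k = beta_k * P_s."""
    return source_power * diffuse_transfer_factor(critical_distance_m, directivity_factor)


# Assumed values: d_c = 1 m; D = 1 (omnidirectional) versus D = 4 (directive beam).
beta_omni = diffuse_transfer_factor(1.0, 1.0)
beta_directive = diffuse_transfer_factor(1.0, 4.0)
reduction_db = 10.0 * math.log10(beta_directive / beta_omni)
print(f"beta (D=1): {beta_omni:.3f}, beta (D=4): {beta_directive:.3f}, "
      f"change: {reduction_db:.2f} dB")
```

Under these assumed numbers, raising the directivity factor from 1 to 4 lowers the diffuse power at the beamformer output by about 6 dB. Doubling the critical distance has the same effect, since d_c enters squared.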

About this article

Cite this article

Šarić, Z., Subotić, M., Bilibajkić, R. et al. Bidirectional microphone array with adaptation controlled by voice activity detector based on multiple beamformers. Multimed Tools Appl 78, 15235–15254 (2019). https://doi.org/10.1007/s11042-018-6895-3

Received: 05 June 2018
Revised: 31 October 2018
Accepted: 13 November 2018
Published: 28 November 2018
Issue Date: 15 June 2019
DOI: https://doi.org/10.1007/s11042-018-6895-3
