Simultaneous Sound Source Localization by Proposed Cuboids Nested Microphone Array Based on Subband Generalized Eigenvalue Decomposition

  • Conference paper
Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2020 (AISI 2020)

Abstract

Multiple sound source localization is an important application in speech processing. In this paper, a cuboids nested microphone array (CuNMA) is proposed for sound acquisition; using this array also eliminates spatial aliasing. Subband processing based on a GammaTone filter bank is then proposed. Next, the generalized eigenvalue decomposition (GEVD) algorithm is applied to all microphone pairs of the CuNMA in each subband of the GammaTone filter bank. In each subband, the standard deviation (SD) of the direction of arrival (DOA) estimates is calculated, and subbands with improper information are eliminated. K-means clustering with the silhouette criterion is then applied to all DOAs to estimate the number of speakers and to assign the related DOAs to each cluster. The proposed method is compared with steered response power-phase transform (SRP-PHAT), Geometric Projection, and spectral source model-deep neural network (SSM-DNN) on simulated data in noisy and reverberant conditions; the results show the superiority of the proposed method over these previous works.
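
The last two stages described in the abstract (SD-based subband rejection, then K-means with a silhouette criterion to count the speakers) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the per-pair, per-subband DOA estimates are already available as an array in degrees, uses a hypothetical fixed SD threshold `max_sd` (the abstract does not state how subbands are rejected), treats angles as plain scalars without wrap-around, and only tests two or more clusters, since the silhouette is undefined for a single cluster. All function names are illustrative, and the data are synthetic.

```python
import numpy as np

def reject_subbands(doas, max_sd):
    """Keep subbands whose DOA estimates agree across microphone pairs.

    doas: array of shape (n_subbands, n_pairs), DOA estimates in degrees.
    max_sd: hypothetical SD threshold in degrees (an assumption of this sketch).
    """
    sd = np.std(doas, axis=1)
    return doas[sd <= max_sd]

def kmeans_1d(x, k, n_iter=100):
    """Plain Lloyd K-means on scalar values, with deterministic quantile init."""
    centers = np.quantile(x, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        updated = np.array([x[labels == j].mean() if np.any(labels == j)
                            else centers[j] for j in range(k)])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels, centers

def mean_silhouette(x, labels):
    """Mean silhouette value using absolute differences as distances."""
    clusters = np.unique(labels)
    if len(clusters) < 2:
        return -1.0  # silhouette is undefined for a single cluster
    d = np.abs(x[:, None] - x[None, :])
    scores = np.zeros(len(x))
    for i in range(len(x)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # singleton clusters score 0 by convention
        a = d[i, same].sum() / (n_same - 1)  # mean distance within own cluster
        b = min(d[i, labels == j].mean() for j in clusters if j != labels[i])
        if max(a, b) > 0:
            scores[i] = (b - a) / max(a, b)
    return float(scores.mean())

def estimate_sources(x, k_max=5):
    """Pick the cluster count in 2..k_max with the highest mean silhouette."""
    best_score, best_k, best_centers = -np.inf, None, None
    for k in range(2, k_max + 1):
        labels, centers = kmeans_1d(x, k)
        score = mean_silhouette(x, labels)
        if score > best_score:
            best_score, best_k, best_centers = score, k, np.sort(centers)
    return best_k, best_centers

# Synthetic data: 3 speakers, 20 reliable subbands each, 8 microphone pairs,
# plus 15 unreliable subbands whose DOA estimates are scattered at random.
rng = np.random.default_rng(0)
true_doas = np.array([30.0, 90.0, 150.0])
good = np.concatenate([rng.normal(t, 2.0, size=(20, 8)) for t in true_doas])
bad = rng.uniform(0.0, 180.0, size=(15, 8))

kept = reject_subbands(np.vstack([good, bad]), max_sd=5.0)
k_hat, centers = estimate_sources(kept.ravel())
print(k_hat, np.round(centers, 1))
```

In this toy setting the unreliable subbands have a much larger spread across pairs than the reliable ones, so a simple threshold separates them. Real DOA estimates would need a distance that handles angular wrap-around.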

Acknowledgment

The authors acknowledge financial support from FONDECYT No. 3190147 and FONDECYT No. 11180107.

Author information

Corresponding author

Correspondence to Ali Dehghan Firoozabadi.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Firoozabadi, A.D. et al. (2021). Simultaneous Sound Source Localization by Proposed Cuboids Nested Microphone Array Based on Subband Generalized Eigenvalue Decomposition. In: Hassanien, A.E., Slowik, A., Snášel, V., El-Deeb, H., Tolba, F.M. (eds) Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2020. AISI 2020. Advances in Intelligent Systems and Computing, vol 1261. Springer, Cham. https://doi.org/10.1007/978-3-030-58669-0_72
