Real-time multiple sound source localization and counting using a soundfield microphone

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

This work presents a method for localizing and counting multiple sound sources that is based on a relaxed sparsity of speech signals. A soundfield microphone is adopted to avoid the redundancy and complexity of a microphone array. After an effective measure is established, the relaxed sparsity of speech signals is investigated. This relaxed sparsity leads to a broad assumption, validated by statistical analysis, that “single-source” zones always exist among the soundfield microphone signals. Based on detection of these “single-source” zones, the proposed method jointly estimates the number of active sources and their directions of arrival (DOAs) by applying a peak search to the normalized histogram of the estimated DOAs. Estimating the DOA only within “single-source” zones resolves the cross distortions caused by multiple simultaneously active sources. Evaluations show that the proposed method achieves higher accuracy in DOA estimation and source counting than existing techniques. The proposed method also has higher efficiency and lower complexity, which makes it suitable for real-time applications.
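
The joint counting and localization step described in the abstract, a peak search on the normalized histogram of DOA estimates, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 5-degree bin width, the three-bin circular smoothing, the 0.05 peak threshold, and the cap on the number of sources are all assumptions chosen for illustration, and the input is assumed to be a list of azimuth estimates (in degrees) that were already obtained from detected “single-source” zones.

```python
def doa_histogram(doas_deg, bin_width=5):
    """Normalized circular histogram of DOA estimates given in degrees."""
    n_bins = 360 // bin_width
    counts = [0] * n_bins
    for d in doas_deg:
        # Wrap the angle into [0, 360) and map it to a bin index.
        counts[int(d % 360) // bin_width % n_bins] += 1
    total = sum(counts) or 1  # avoid division by zero on empty input
    return [c / total for c in counts]


def smooth_circular(hist, half_width=1):
    """Moving average that wraps around at 0/360 degrees."""
    n = len(hist)
    width = 2 * half_width + 1
    return [
        sum(hist[(i + k) % n] for k in range(-half_width, half_width + 1)) / width
        for i in range(n)
    ]


def count_and_localize(doas_deg, bin_width=5, half_width=1,
                       threshold=0.05, max_sources=6):
    """Return (number of sources, sorted DOAs in degrees) from histogram peaks.

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    hist = smooth_circular(doa_histogram(doas_deg, bin_width), half_width)
    n = len(hist)
    peaks = []
    for i, h in enumerate(hist):
        # A peak is a local maximum of the smoothed histogram above the threshold.
        if h >= threshold and h > hist[(i - 1) % n] and h >= hist[(i + 1) % n]:
            peaks.append((h, i))
    peaks.sort(reverse=True)          # strongest peaks first
    peaks = peaks[:max_sources]
    # Report each peak as the center of its histogram bin.
    doas = sorted(i * bin_width + bin_width / 2 for _, i in peaks)
    return len(doas), doas


# Synthetic example: two sources near 62 and 201 degrees plus a few outliers.
offsets = (-4, -2, -1, 0, 0, 1, 2, 4)
estimates = ([62 + o for o in offsets] * 10
             + [201 + o for o in offsets] * 10
             + [10, 120, 300, 330])
print(count_and_localize(estimates))  # (2, [62.5, 202.5])
```

In this sketch the source count falls out of the same peak search that yields the DOAs, which mirrors the joint estimation the abstract describes. The resolution is limited to the bin width, and the threshold governs how easily isolated outlier estimates are mistaken for sources.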

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61231015, 61201197), the Specialized Research Fund for the Doctoral Program of Higher Education of the People’s Republic of China (No. 20121103120017), and the Beijing Postdoctoral Research Foundation.

Author information

Corresponding author

Correspondence to Maoshen Jia.

About this article

Cite this article

Jia, M., Sun, J. & Bao, C. Real-time multiple sound source localization and counting using a soundfield microphone. J Ambient Intell Human Comput 8, 829–844 (2017). https://doi.org/10.1007/s12652-016-0388-x

  • DOI: https://doi.org/10.1007/s12652-016-0388-x
