Abstract
Localizing and tracking multiple speakers in noisy, reverberant environments is a challenging task. This paper proposes a distributed multiple speaker tracking method for microphone array networks, based on the unscented particle filter and data association. First, the generalized cross correlation (GCC) function is estimated at each node, and the time difference of arrival (TDOA) is taken as the observation for tracking. Because interference from other acoustic sources and multipath propagation make the observations ambiguous, a data association technique is used at each node to associate the available observations with each speaker. Then, according to the characteristics of the GCC function, the association probability of the valid observations for each speaker is calculated by combining the amplitude of the valid observations with the statistical distance. Next, the local state of the speakers is obtained at each node using the unscented particle filter. Finally, according to the reliability of each node's local state, a dynamic weighted consensus fusion algorithm is presented to approximate the global state estimate and achieve good multiple speaker tracking performance. The proposed method tracks multiple speakers in reverberant and noisy environments in a distributed manner, and it is scalable and robust against node failure in the distributed microphone array (DMA) network. Simulation results verify the effectiveness of the proposed method.
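The first step described above, estimating a GCC function at a node and reading a TDOA from its peak, can be illustrated with a minimal sketch. The abstract does not state which GCC weighting is used, so the PHAT weighting here is an assumption, as are the function name `gcc_phat`, its parameters, and the synthetic test signal. It shows the general technique only and is not the authors' implementation.

```python
import numpy as np


def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate how much x2 lags x1 (in seconds) from the GCC peak.

    PHAT weighting is an assumption; the abstract only says "GCC".
    Returns the delay estimate and the GCC values over the searched lags.
    """
    n = len(x1) + len(x2)  # zero-pad so the correlation is not circular
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-power spectrum, normalized to unit magnitude (PHAT weighting).
    cross = X2 * np.conj(X1)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[n - max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs, cc


if __name__ == "__main__":
    # Hypothetical example: white noise delayed by 5 samples at 16 kHz.
    rng = np.random.default_rng(0)
    fs = 16000
    s = rng.standard_normal(4096)
    delayed = np.concatenate((np.zeros(5), s[:-5]))
    tau, _ = gcc_phat(s, delayed, fs, max_tau=1e-3)
    print(f"estimated TDOA: {tau * 1e6:.1f} microseconds")
```

In a reverberant room the GCC function typically has several comparable peaks rather than one, which is the ambiguity the data association step is meant to resolve. A tracker would therefore keep several candidate peaks, with their amplitudes, instead of only the maximum.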
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 61771091, 61871066), the National High Technology Research and Development Program (863 Program) of China (No. 2015AA016306), the Natural Science Foundation of Liaoning Province of China (No. 20170540159), and the Fundamental Research Funds for the Central Universities of China (No. DUT17LAB04).
Data Availability
The datasets generated during and/or analyzed during the current study are available in the TIMIT repository, https://catalog.ldc.upenn.edu/LDC93S1.
About this article
Cite this article
Wang, R., Chen, Z. & Yin, F. Distributed Multiple Speaker Tracking Based on Unscented Particle Filter and Data Association in Microphone Array Networks. Circuits Syst Signal Process 41, 933–955 (2022). https://doi.org/10.1007/s00034-021-01812-8
DOI: https://doi.org/10.1007/s00034-021-01812-8