Abstract
In prior work, we developed a speaker tracking system based on an extended Kalman filter using time delays of arrival (TDOAs) as acoustic features. In particular, the TDOAs comprised the observation associated with an iterated extended Kalman filter (IEKF) whose state corresponds to the speaker position. In other work, we followed the same approach to develop a system that could use both audio and video information to track a moving lecturer. While these systems functioned well, their utility was limited to scenarios in which a single speaker was to be tracked. In this work, we seek to remove this restriction by generalizing the IEKF, first to a probabilistic data association filter, which incorporates a clutter model for rejection of spurious acoustic events, and then to a joint probabilistic data association filter (JPDAF), which maintains a separate state vector for each active speaker. In a set of experiments conducted on seminar and meeting data, we demonstrate that the JPDAF provides tracking performance superior to the IEKF.
This work was sponsored by the European Union under the integrated project CHIL, Computers in the Human Interaction Loop, contract number 506909.
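As an illustration of the TDOA observation that the abstract refers to, the following minimal sketch computes the time delay of arrival that a hypothesized speaker position would produce at a pair of microphones. This is the kind of predicted observation a Kalman-filter-based tracker compares against measured TDOAs. The function names, the coordinate convention, and the speed-of-sound value are assumptions made for illustration and are not taken from the paper.

```python
import math

# Assumed value for the speed of sound in air at room temperature (m/s).
SPEED_OF_SOUND = 343.0


def tdoa(source, mic_a, mic_b, c=SPEED_OF_SOUND):
    """Predicted TDOA in seconds for one microphone pair.

    Positive when the source is farther from mic_a than from mic_b.
    All positions are (x, y, z) tuples in metres (hypothetical convention).
    """
    return (math.dist(source, mic_a) - math.dist(source, mic_b)) / c


def predicted_observation(source, mic_pairs, c=SPEED_OF_SOUND):
    """Stack the predicted TDOAs of all microphone pairs into one list."""
    return [tdoa(source, a, b, c) for a, b in mic_pairs]
```

For a source equidistant from both microphones the predicted delay is zero, and swapping the two microphones of a pair flips the sign of the delay.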
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Gehrig, T., McDonough, J. (2007). Tracking Multiple Speakers with Probabilistic Data Association Filters. In: Stiefelhagen, R., Garofolo, J. (eds) Multimodal Technologies for Perception of Humans. CLEAR 2006. Lecture Notes in Computer Science, vol 4122. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69568-4_10
DOI: https://doi.org/10.1007/978-3-540-69568-4_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69567-7
Online ISBN: 978-3-540-69568-4
eBook Packages: Computer Science, Computer Science (R0)