The AIT 3D Audio / Visual Person Tracker for CLEAR 2007

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4625)

Abstract

This paper presents the Athens Information Technology system for 3D person tracking and the results it obtained in the CLEAR 2007 evaluations. The system uses audiovisual information from multiple acoustic and video sensors. It comprises a video subsystem and an audio subsystem, whose results are combined to track the last active speaker. The video subsystem combines a number of 2D face localization systems in 3D, aiming to track all people present in a room. The audio subsystem applies an information-theoretic metric to an ensemble of microphones to locate the active speaker.
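The abstract does not specify the information-theoretic metric, so the following is only an illustrative sketch of one way such a metric could be applied to a microphone pair, not the authors' implementation. It assumes the two signals are jointly Gaussian, in which case their mutual information reduces to -0.5 * ln(1 - rho^2), with rho the correlation coefficient, and it picks the inter-microphone delay that maximizes this value. All function names, the simulated signals, and the parameter values are hypothetical.

```python
import math
import random


def gaussian_mutual_information(x, y):
    """Mutual information (in nats) of two equal-length sequences,
    ASSUMING they are jointly Gaussian: MI = -0.5 * ln(1 - rho^2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant signal carries no information
    # Clamp so that perfectly correlated inputs do not produce log(0).
    rho_sq = min(sxy * sxy / (sxx * syy), 1.0 - 1e-12)
    return -0.5 * math.log(1.0 - rho_sq)


def estimate_delay(mic_a, mic_b, max_lag):
    """Return the lag (in samples) by which mic_b trails mic_a,
    chosen as the lag that maximizes the mutual information."""
    n = len(mic_a)
    best_lag, best_mi = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            seg_a, seg_b = mic_a[:n - lag], mic_b[lag:]
        else:
            seg_a, seg_b = mic_a[-lag:], mic_b[:n + lag]
        mi = gaussian_mutual_information(seg_a, seg_b)
        if mi > best_mi:
            best_lag, best_mi = lag, mi
    return best_lag


def simulate_pair(delay, n, seed, noise_std=0.1):
    """Hypothetical test data: white-noise source heard directly by mic_a
    and, `delay` (>= 0) samples later plus sensor noise, by mic_b."""
    rng = random.Random(seed)
    source = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mic_a = list(source)
    mic_b = [
        (source[i - delay] if i >= delay else 0.0) + rng.gauss(0.0, noise_std)
        for i in range(n)
    ]
    return mic_a, mic_b


if __name__ == "__main__":
    a, b = simulate_pair(delay=7, n=2000, seed=0)
    print("estimated delay (samples):", estimate_delay(a, b, max_lag=20))
```

In a multi-microphone setup, delays estimated this way for several pairs would typically be converted into a direction or position estimate; that step, and any handling of reverberation, is outside the scope of this sketch.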

Editor information

Rainer Stiefelhagen Rachel Bowers Jonathan Fiscus

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Katsarakis, N., Talantzis, F., Pnevmatikakis, A., Polymenakos, L. (2008). The AIT 3D Audio / Visual Person Tracker for CLEAR 2007. In: Stiefelhagen, R., Bowers, R., Fiscus, J. (eds) Multimodal Technologies for Perception of Humans. RT CLEAR 2007 2007. Lecture Notes in Computer Science, vol 4625. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68585-2_2

  • DOI: https://doi.org/10.1007/978-3-540-68585-2_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68584-5

  • Online ISBN: 978-3-540-68585-2

  • eBook Packages: Computer Science, Computer Science (R0)
