Understanding Interactions and Guiding Visual Surveillance by Tracking Attention

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6468)

Abstract

The central tenet of this paper is that determining where people are looking simplifies other tasks involved in understanding and interrogating a scene. To this end, we describe a fully automatic method to determine a person’s attention based on real-time visual tracking of their head and a coarse classification of their head pose. We estimate the head pose, or coarse gaze, using randomised ferns with decision branches based on both histograms of gradient orientations and colour-based features. We use the coarse gaze in three applications to demonstrate its value: (i) by building static and temporally varying maps of the areas where people look, we are able to identify interesting regions; (ii) by determining the gaze of people in the scene, we can more effectively control a multi-camera surveillance system to acquire faces for identification; (iii) by identifying where people are looking, we can more effectively classify human interactions.
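
As a rough illustration of the randomised-fern classification mentioned in the abstract, the sketch below implements a generic fern ensemble in plain Python. It is not the authors' implementation. The class name `RandomFerns`, the flat feature vector (standing in for gradient-orientation histogram bins and colour features), the pairwise-comparison tests, and all parameter values are assumptions made for this example only.

```python
import math
import random
from typing import List, Sequence, Tuple


class RandomFerns:
    """Generic random-fern ensemble (illustrative; not the paper's code)."""

    def __init__(self, n_features: int, n_classes: int,
                 n_ferns: int = 20, depth: int = 6, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.n_classes = n_classes
        # Each fern is a fixed list of `depth` binary tests. A test here
        # compares two randomly chosen feature indices (an assumed choice).
        self.ferns: List[List[Tuple[int, int]]] = []
        for _ in range(n_ferns):
            tests = []
            for _ in range(depth):
                i, j = rng.sample(range(n_features), 2)
                tests.append((i, j))
            self.ferns.append(tests)
        n_leaves = 2 ** depth
        # counts[fern][leaf][class], started at 1 for Laplace smoothing.
        self.counts = [[[1] * n_classes for _ in range(n_leaves)]
                       for _ in range(n_ferns)]
        # Smoothed number of training samples seen per class.
        self.class_totals = [n_leaves] * n_classes

    def _leaf(self, tests: List[Tuple[int, int]], x: Sequence[float]) -> int:
        # Concatenate the binary test outcomes into a leaf index.
        index = 0
        for i, j in tests:
            index = (index << 1) | (1 if x[i] > x[j] else 0)
        return index

    def fit(self, samples: Sequence[Sequence[float]],
            labels: Sequence[int]) -> None:
        for x, y in zip(samples, labels):
            self.class_totals[y] += 1
            for f, tests in enumerate(self.ferns):
                self.counts[f][self._leaf(tests, x)][y] += 1

    def predict(self, x: Sequence[float]) -> int:
        # Treat ferns as independent and sum log P(leaf | class) over them,
        # assuming a uniform prior over classes.
        scores = [0.0] * self.n_classes
        for f, tests in enumerate(self.ferns):
            leaf_counts = self.counts[f][self._leaf(tests, x)]
            for c in range(self.n_classes):
                scores[c] += math.log(leaf_counts[c] / self.class_totals[c])
        return max(range(self.n_classes), key=lambda c: scores[c])
```

In a coarse-gaze setting, each class label would correspond to one discrete head-pose direction and `x` to the features extracted from a tracked head region. Both the number of directions and the feature layout are left open here because the abstract does not specify them.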




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Reid, I., Benfold, B., Patron, A., Sommerlade, E. (2011). Understanding Interactions and Guiding Visual Surveillance by Tracking Attention. In: Koch, R., Huang, F. (eds) Computer Vision – ACCV 2010 Workshops. ACCV 2010. Lecture Notes in Computer Science, vol 6468. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22822-3_38

  • DOI: https://doi.org/10.1007/978-3-642-22822-3_38

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22821-6

  • Online ISBN: 978-3-642-22822-3

  • eBook Packages: Computer Science, Computer Science (R0)
