
Visual Estimation of Attentive Cues in HRI: The Case of Torso and Head Pose

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9163)

Abstract

Capturing visual human-centered information is a fundamental input source for effective and successful human-robot interaction (HRI) in dynamic multi-party social settings. Torso and head pose, as forms of nonverbal communication, support the derivation of people’s focus of attention, a key variable in the analysis of human behaviour in HRI paradigms encompassing social aspects. Towards this goal, we have developed a model-based approach for torso and head pose estimation that overcomes key limitations of free-form interaction scenarios and handles partial intra- and inter-person occlusions. The proposed approach builds on the concept of Top View Re-projection (TVR) to uniformly treat the respective body parts, modelled as cylinders. For each body part, a number of pose hypotheses are sampled from its configuration space. Each pose hypothesis is evaluated against a scoring function, and the hypothesis with the best score yields the assumed pose and the locations of the joints. A refinement step is then applied to the head pose, based on tracking facial patch deformations, to compute the horizontal off-plane rotation. The overall approach forms one of the core components of a vision system integrated in a robotic platform that supports socially appropriate, multi-party, multimodal interaction in a bartending scenario. Results obtained in the robot’s environment during real HRI experiments with a varying number of users attest to the effectiveness of our approach.
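The sketch below illustrates the sample-and-score scheme outlined in the abstract, assuming a body part modelled as an upright cylinder and a top-view occupancy grid obtained by re-projecting depth data. It is not the authors' implementation: the function names, parameter values, and the concrete overlap-based scoring rule are assumptions made for this example, and the facial-patch refinement of head rotation is omitted.

```python
import numpy as np

# Illustrative sketch (not the paper's implementation) of hypothesis sampling
# and scoring for one body part: sample candidate poses from the configuration
# space, score each against an observed top-view occupancy map, keep the best.

GRID_RES = 0.01      # metres per cell of the top-view grid (assumed value)
GRID_EXTENT = 0.5    # half-width of the grid in metres (assumed value)

def sample_hypotheses(n, centre, rng):
    """Draw n pose hypotheses (x, y, yaw) around a rough person centre."""
    xy = centre + rng.normal(scale=0.05, size=(n, 2))
    yaw = rng.uniform(-np.pi, np.pi, size=n)
    return np.column_stack([xy, yaw])

def cylinder_footprint(x0, y0, radius=0.15):
    """Top-view (re-projected) footprint of an upright cylinder at (x0, y0)."""
    axis = np.arange(-GRID_EXTENT, GRID_EXTENT, GRID_RES)
    xx, yy = np.meshgrid(axis, axis)
    return (xx - x0) ** 2 + (yy - y0) ** 2 <= radius ** 2

def score(hypothesis, occupancy):
    """Overlap ratio between the model footprint and the observed occupancy.

    Note: an upright cylinder's footprint is rotation-invariant, so this toy
    score constrains only position; in the paper, head rotation is refined
    separately by tracking facial patch deformations.
    """
    model = cylinder_footprint(hypothesis[0], hypothesis[1])
    return np.logical_and(model, occupancy).sum() / max(model.sum(), 1)

def estimate_pose(occupancy, centre, n_hypotheses=200, seed=0):
    """Return the best-scoring pose hypothesis for one body part."""
    rng = np.random.default_rng(seed)
    hypotheses = sample_hypotheses(n_hypotheses, centre, rng)
    scores = [score(h, occupancy) for h in hypotheses]
    return hypotheses[int(np.argmax(scores))]

if __name__ == "__main__":
    # Fake 1 m x 1 m top-view occupancy map, as if re-projected from depth data.
    occupancy = np.zeros((100, 100), dtype=bool)
    occupancy[40:60, 45:65] = True
    best = estimate_pose(occupancy, centre=np.array([0.0, 0.0]))
    print("estimated (x, y, yaw):", best)
```

In this toy version the same routine would be run once per tracked body part (torso, head), with the head search constrained by the estimated torso pose; the paper's actual scoring operates on the TVR of the depth data rather than a synthetic grid.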


Acknowledgments

This work was partially supported by the European Commission under contract number FP7-270435 (JAMES project).

Author information


Corresponding author

Correspondence to Markos Sigalas.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Sigalas, M., Pateraki, M., Trahanias, P. (2015). Visual Estimation of Attentive Cues in HRI: The Case of Torso and Head Pose. In: Nalpantidis, L., Krüger, V., Eklundh, J.O., Gasteratos, A. (eds) Computer Vision Systems. ICVS 2015. Lecture Notes in Computer Science, vol 9163. Springer, Cham. https://doi.org/10.1007/978-3-319-20904-3_34

  • DOI: https://doi.org/10.1007/978-3-319-20904-3_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20903-6

  • Online ISBN: 978-3-319-20904-3

