
Investigating shared attention with a virtual agent using a gaze-based interface

Original Paper
Journal on Multimodal User Interfaces

Abstract

This paper investigates the use of a gaze-based interface for testing simple shared attention behaviours during an interaction scenario with a virtual agent. The interface is non-intrusive, using a standard web-camera for input: it monitors users' head directions and resolves them in real-time to screen coordinates. We use the interface to investigate user perception of the agent's behaviour during a shared attention scenario. Our aim is to identify important factors to be considered when constructing engagement models, which must account not only for behaviour in isolation, but also for the context of the interaction, as is the case during shared attention situations.
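Since resolving head direction to screen coordinates is central to the interface, a minimal sketch may help picture that step. The sketch below is an illustration only, not the authors' implementation: the linear mapping, the angle ranges, the screen resolution, and the function name pose_to_screen are all assumptions introduced for the example, and a real pipeline would first estimate yaw and pitch from webcam frames (e.g. via facial feature tracking, as the paper describes).

```python
# Illustrative sketch: resolving estimated head-pose angles to screen
# coordinates. The angle-to-pixel ranges and screen size below are assumed
# values for the example, not parameters from the paper.

SCREEN_W, SCREEN_H = 1920, 1080  # target display resolution (assumed)
YAW_RANGE = 20.0    # degrees of head yaw spanning the screen width (assumed)
PITCH_RANGE = 15.0  # degrees of head pitch spanning the screen height (assumed)

def pose_to_screen(yaw_deg: float, pitch_deg: float) -> tuple[int, int]:
    """Linearly map head yaw/pitch (degrees, 0 = facing the camera centre)
    to pixel coordinates, clamped to the screen bounds."""
    x = (yaw_deg / YAW_RANGE + 0.5) * SCREEN_W
    y = (pitch_deg / PITCH_RANGE + 0.5) * SCREEN_H
    return (min(max(int(x), 0), SCREEN_W - 1),
            min(max(int(y), 0), SCREEN_H - 1))

if __name__ == "__main__":
    # A user looking slightly left of and above the screen centre.
    print(pose_to_screen(-5.0, -3.0))  # -> (480, 324)
```

In practice the raw per-frame estimates would also be smoothed over time before use, since webcam-based head-pose estimation is noisy frame to frame.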



Author information

Correspondence to Christopher Peters.

Electronic Supplementary Material

The article is accompanied by electronic supplementary material (AVI, 3.25 MB).


About this article

Cite this article

Peters, C., Asteriadis, S. & Karpouzis, K. Investigating shared attention with a virtual agent using a gaze-based interface. J Multimodal User Interfaces 3, 119–130 (2010). https://doi.org/10.1007/s12193-009-0029-1
