
Tracking Body Parts of Multiple People for Multi-person Multimodal Interface

Conference paper
Computer Vision in Human-Computer Interaction (HCI 2005)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3766)


Abstract

Although large displays could allow several users to work together and to move freely in a room, their associated interfaces are limited to contact devices that must generally be shared. This paper describes a novel interface called SHIVA (Several-Humans Interface with Vision and Audio) that allows several users to interact remotely with a very large display using both speech and gesture. The head and both hands of two users are tracked in real time by a stereo-vision-based system. From the body part positions, the direction pointed at by each user is computed, and selection gestures made with the second hand are recognized. The pointing gesture is fused with the n-best results from speech recognition, taking the application context into account. The system is tested on a chess game with two users playing on a very large display.
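The abstract describes two computational steps that lend themselves to a short illustration: intersecting the head-to-hand pointing ray with the display plane to obtain a pointed position, and rescoring the speech recognizer's n-best list with that position under the application context (here, chess squares). The sketch below is a minimal, hypothetical rendering of those ideas in Python; the coordinate frame, the Gaussian rescoring, the square positions, and all function names are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative sketch only: pointed-position estimation and speech/gesture
# fusion along the lines described in the abstract. All names, frames and
# weights are assumptions, not the SHIVA implementation.
import numpy as np


def pointed_display_position(head, hand, screen_point, screen_normal):
    """Intersect the head-to-hand ray with the display plane.

    head, hand: 3-D positions in a common world frame (metres).
    screen_point, screen_normal: a point on the display plane and its unit normal.
    Returns the 3-D intersection point, or None if the user points parallel to
    the screen or away from it.
    """
    direction = hand - head
    denom = np.dot(screen_normal, direction)
    if abs(denom) < 1e-6:
        return None                      # pointing parallel to the screen
    t = np.dot(screen_normal, screen_point - head) / denom
    if t <= 0:
        return None                      # screen lies behind the pointing ray
    return head + t * direction


def fuse_pointing_with_speech(target, nbest, squares, sigma=0.15):
    """Rescore speech n-best hypotheses with the pointed position.

    nbest:   list of (square_name, speech_score) pairs, e.g. [("e4", 0.55), ...]
    squares: mapping from square names to their 3-D centres on the display
    Each hypothesis is weighted by a Gaussian of its distance to the pointed
    position; the best combined hypothesis is returned.
    """
    best_name, best_score = None, float("-inf")
    for name, speech_score in nbest:
        dist = np.linalg.norm(squares[name] - target)
        score = speech_score * np.exp(-0.5 * (dist / sigma) ** 2)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


if __name__ == "__main__":
    head = np.array([0.0, 1.7, 2.5])          # user's head, 2.5 m from the screen
    hand = np.array([0.3, 1.4, 2.0])          # pointing hand
    screen_point = np.array([0.0, 1.0, 0.0])  # the display plane is z = 0
    screen_normal = np.array([0.0, 0.0, 1.0])

    target = pointed_display_position(head, hand, screen_point, screen_normal)
    squares = {"e4": np.array([1.5, 0.2, 0.0]),
               "e5": np.array([1.5, 0.35, 0.0])}
    nbest = [("e4", 0.55), ("e5", 0.45)]
    print(fuse_pointing_with_speech(target, nbest, squares))
```

In the paper's setting, the application context could further constrain this fusion, for example by discarding hypotheses that do not correspond to legal chess moves before rescoring.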






Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Carbini, S., Viallet, J.E., Bernier, O., Bascle, B. (2005). Tracking Body Parts of Multiple People for Multi-person Multimodal Interface. In: Sebe, N., Lew, M., Huang, T.S. (eds) Computer Vision in Human-Computer Interaction. HCI 2005. Lecture Notes in Computer Science, vol 3766. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11573425_2


  • DOI: https://doi.org/10.1007/11573425_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29620-1

  • Online ISBN: 978-3-540-32129-3

  • eBook Packages: Computer Science (R0)
