Multimodal Interaction Abilities for a Robot Companion

Conference paper
Computer Vision Systems (ICVS 2008)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 5008)


Abstract

Among the cognitive abilities a robot companion must be endowed with, human perception and speech understanding are both fundamental in the context of multimodal human-robot interaction. To provide a mobile robot with visual perception of its user and the means to handle verbal and multimodal communication, we have developed and integrated two components. In this paper we focus on an interactively distributed multiple-object tracker dedicated to two-handed gestures and head location in 3D. Its relevance is highlighted by on- and off-line evaluations on data acquired by the robot. Implementation and preliminary experiments on a household robot companion, including speech recognition and understanding as well as basic fusion with gesture, are then demonstrated. These experiments illustrate how vision can assist speech by resolving location references and object/person IDs in verbal statements, allowing natural deictic commands given by humans to be interpreted. Finally, extensions of our work are discussed.
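To make the fusion step concrete, the sketch below shows one common way such a system can ground a deictic word: a pointing ray is cast from the tracked 3D head position through the tracked hand position, and the known object closest to that ray replaces the deictic in the recognized utterance. This is a minimal illustration under our own assumptions, not the authors' implementation; all names, positions, and thresholds are hypothetical.

    import numpy as np

    # Hypothetical sketch of speech/gesture fusion for deictic commands:
    # the word "that" in a recognized utterance is grounded by casting a
    # ray from the tracked head through the tracked hand (a common
    # pointing model) and choosing the known object closest to the ray.

    def pointing_ray(head, hand):
        # Origin and unit direction of the head-to-hand pointing ray.
        origin = np.asarray(head, dtype=float)
        direction = np.asarray(hand, dtype=float) - origin
        return origin, direction / np.linalg.norm(direction)

    def distance_to_ray(point, origin, direction):
        # Perpendicular distance from a 3D point to the ray, in metres.
        v = np.asarray(point, dtype=float) - origin
        t = max(float(v @ direction), 0.0)   # only points in front of the user
        return float(np.linalg.norm(v - t * direction))

    def resolve_deictic(utterance, head, hand, objects, max_dist=0.3):
        # Replace "that" with the ID of the pointed-at object, if one lies
        # within max_dist of the pointing ray; otherwise leave it unresolved.
        if "that" not in utterance.split():
            return utterance
        origin, direction = pointing_ray(head, hand)
        best_id, best_d = None, max_dist
        for obj_id, pos in objects.items():
            d = distance_to_ray(pos, origin, direction)
            if d < best_d:
                best_id, best_d = obj_id, d
        return utterance.replace("that", best_id) if best_id else utterance

    # Example: the user says "take that" while pointing toward the bottle.
    objects = {"bottle": (0.7, 0.1, 0.65), "cup": (0.5, -0.8, 0.9)}
    print(resolve_deictic("take that", (0.0, 0.0, 1.6), (0.3, 0.05, 1.2), objects))
    # prints "take bottle"

In the paper's setting, the head and hand positions would come from the 3D tracker and the utterance from the speech recognition and understanding component; a complete system would also handle spatial deictics such as "there" and ambiguity between nearby candidates.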



Author information

B. Burger, I. Ferrané, F. Lerasle

Editor information

Antonios Gasteratos, Markus Vincze, John K. Tsotsos


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Burger, B., Ferrané, I., Lerasle, F. (2008). Multimodal Interaction Abilities for a Robot Companion. In: Gasteratos, A., Vincze, M., Tsotsos, J.K. (eds) Computer Vision Systems. ICVS 2008. Lecture Notes in Computer Science, vol 5008. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79547-6_53

  • DOI: https://doi.org/10.1007/978-3-540-79547-6_53

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-79546-9

  • Online ISBN: 978-3-540-79547-6

