Tasking robots through multimodal interfaces: The “Coach Metaphor”

Conference paper in Collective Robotics (CRW 1998)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1456)

Abstract

This paper presents multimodal interfaces for tasking and controlling multiple robots coordinated by an agent-based architecture. For the past few years, SRI International has followed an approach based on the “Coach Metaphor”. In sports or business, coaches apply predefined strategies to their teams or, if something goes wrong, devise new means and plans during an ongoing game so as to retask either the entire team or specific players. This is also the challenge facing a robot operator. SRI's agent-based framework, the Open Agent Architecture™ (OAA), provides communication between the members of a team and the external world. The coach, or robot operator, who is an active member of the team, is provided with a multimodal interface that uses pen and voice. The analogy of a coach talking and drawing on a white clipboard representing the virtual world in which the players develop their game reinforces the metaphor. We present several interfaces developed specifically for SRI's robots, and we show an example (controlling robots on a soccer field) where the metaphor maps one to one onto the real world. To clarify our views, we give an overview of the technologies in use, such as the agent architecture, the speech and gesture recognizers, and the robot controller.
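
The abstract sketches a tasking loop in which a spoken command and a roughly simultaneous pen gesture on the virtual field are fused into concrete orders for the addressed robots. As a rough illustration only, the Python sketch below shows one way such a fusion step could look; the event and task types, the fuse() function, and the two-second time-skew threshold are hypothetical assumptions and do not reproduce the paper's actual OAA agents, recognizers, or robot controller.

```python
from dataclasses import dataclass


@dataclass
class SpeechEvent:
    """A recognized spoken command, e.g. 'robot two and three, go here'."""
    intent: str          # e.g. "goto"
    targets: list[str]   # robots (players) the coach addressed
    timestamp: float     # seconds


@dataclass
class PenEvent:
    """A pen gesture on the virtual field (the coach's 'clipboard')."""
    x: float
    y: float
    timestamp: float


@dataclass
class RobotTask:
    robot: str
    action: str
    location: tuple[float, float]


def fuse(speech: SpeechEvent, pen: PenEvent, max_skew: float = 2.0) -> list[RobotTask]:
    """Pair a spoken intent with a roughly simultaneous pen gesture.

    The gesture resolves the deictic part of the command ('here'),
    yielding one concrete task per addressed robot.
    """
    if abs(speech.timestamp - pen.timestamp) > max_skew:
        raise ValueError("speech and gesture are too far apart in time to fuse")
    return [RobotTask(robot=r, action=speech.intent, location=(pen.x, pen.y))
            for r in speech.targets]


if __name__ == "__main__":
    # "Robot two and robot three, go here" + a tap on the field at (4.5, 1.2)
    tasks = fuse(SpeechEvent("goto", ["robot2", "robot3"], timestamp=10.0),
                 PenEvent(4.5, 1.2, timestamp=10.4))
    for t in tasks:
        print(f"{t.robot}: {t.action} -> {t.location}")
```

Running the example prints one task per addressed robot, mirroring a coach who names several players while pointing at a spot on the clipboard.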

Editor information

Alexis Drogoul, Milind Tambe, Toshio Fukuda

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Julia, L. (1998). Tasking robots through multimodal interfaces: The “Coach Metaphor”. In: Drogoul, A., Tambe, M., Fukuda, T. (eds) Collective Robotics. CRW 1998. Lecture Notes in Computer Science, vol 1456. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0033372

  • DOI: https://doi.org/10.1007/BFb0033372

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64768-3

  • Online ISBN: 978-3-540-68723-8
