Abstract
Speech recognition provides a natural and familiar interface for people to convey information, so it is likely to be used as the human interface in service robots. However, for a robot to act on what the user tells it, it needs information beyond what the speech input provides. First, we consider a problem widely discussed in natural language processing: speakers abbreviate or omit information that both parties share as common context. A robot faces a second problem as well: it lacks the information that links symbols in its internal world to things in the real world. We propose a method that uses image processing to supply the information that language processing alone lacks and that the robot needs to carry out the requested action. When image processing fails, the robot asks the user directly and uses the answer to help it accomplish its task. We confirm the approach through experiments in simulation and on a real robot, and test its reliability.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chong, S., Kuno, Y., Shimada, N., Shirai, Y. (2000). Human-Robot Interface Based on Speech Understanding Assisted by Vision. In: Tan, T., Shi, Y., Gao, W. (eds) Advances in Multimodal Interfaces — ICMI 2000. ICMI 2000. Lecture Notes in Computer Science, vol 1948. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-40063-X_3
DOI: https://doi.org/10.1007/3-540-40063-X_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41180-2
Online ISBN: 978-3-540-40063-9
eBook Packages: Springer Book Archive