Human-Robot Interface Based on Speech Understanding Assisted by Vision

Chong, Shengshien; Kuno, Yoshinori; Shimada, Nobutaka; Shirai, Yoshiaki

doi:10.1007/3-540-40063-X_3

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1948)

Included in the following conference series:

International Conference on Multimodal Interfaces

Abstract

Speech recognition provides a natural and familiar way for people to convey information, so it is likely to be used as the human interface in service robots. However, for a robot to act in accordance with what the user tells it, it needs information beyond what speech input provides. First, we consider a problem widely discussed in natural language processing: utterances are abbreviated because the parties share a common context. A robot faces a second problem as well: it lacks information linking the symbols in its own world to things in the real world. We propose a method that uses image processing to supply the information that language processing lacks and that is needed to carry out the action. When image processing fails, the robot asks the user directly and uses the answer to help it accomplish its task. We confirm our approach through experiments in simulation and on a real robot, and test its reliability.
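The fallback order described in the abstract (language first, then vision, then a direct question to the user) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Command`, `Resolution`, `resolve_target`, `locate_by_vision`, and `ask_user` are all hypothetical, and the vision and dialogue components are stand-in callables supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple, Union

# Hypothetical types for illustration only; none of these names come from the paper.
Position = Tuple[float, float]


@dataclass
class Command:
    action: str            # verb recognised from speech, e.g. "fetch"
    target: Optional[str]  # object named in the utterance, or None if it was omitted


@dataclass
class Resolution:
    name: str                       # symbol the robot decided the user meant
    location: Union[Position, str]  # coordinates from vision, or the user's verbal answer
    source: str                     # "vision" or "user"


def resolve_target(
    command: Command,
    last_target: Optional[str],
    locate_by_vision: Callable[[str], Optional[Position]],
    ask_user: Callable[[str], str],
) -> Resolution:
    """Fill in what the utterance left out, then ground the symbol in the scene."""
    # Step 1: abbreviated utterance. Reuse the shared context if the object was omitted.
    name = command.target or last_target
    if name is None:
        # No context to fall back on, so ask the user which object is meant.
        name = ask_user("Which object do you mean?")

    # Step 2: symbol grounding. Try to find the named object with image processing.
    position = locate_by_vision(name)
    if position is not None:
        return Resolution(name, position, "vision")

    # Step 3: image processing failed, so ask the user directly and use the answer.
    answer = ask_user(f"I cannot find the {name}. Where is it?")
    return Resolution(name, answer, "user")
```

Passing the vision and dialogue steps in as callables keeps the sketch independent of any particular recogniser or camera setup, so the same control flow can be exercised with stubs in simulation and with real components on a robot.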

Author information

Authors and Affiliations

Department of Computer-Controlled Mechanical Systems, Osaka University, Japan
Shengshien Chong, Yoshinori Kuno, Nobutaka Shimada & Yoshiaki Shirai
Department of Information and Computer Sciences, Saitama University, Japan
Yoshinori Kuno

Editor information

Editors and Affiliations

Institute of Automation, Chinese Academy of Sciences, P.O.Box 2728, 100080, Beijing, China
Tieniu Tan
Computer Department, Media Laboratory, Tsinghua University, 100084, Beijing, China
Yuanchun Shi
Institute of Computing Technology, Chinese Academy of Sciences, P.O.Box 2704, 100080, Beijing, China
Wen Gao

About this paper

Cite this paper

Chong, S., Kuno, Y., Shimada, N., Shirai, Y. (2000). Human-Robot Interface Based on Speech Understanding Assisted by Vision. In: Tan, T., Shi, Y., Gao, W. (eds) Advances in Multimodal Interfaces — ICMI 2000. ICMI 2000. Lecture Notes in Computer Science, vol 1948. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-40063-X_3

DOI: https://doi.org/10.1007/3-540-40063-X_3
Published: 26 October 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41180-2
Online ISBN: 978-3-540-40063-9
eBook Packages: Springer Book Archive
