Abstract
Social interaction is essential to improving human-robot interfaces. Behaviors for social interaction may include paying attention to a new sound source, moving toward it, or keeping face-to-face contact with a moving speaker. Some sound-centered behaviors are difficult to attain, because mixtures of sounds are not handled well or because auditory processing is too slow for real-time applications. Recently, Nakadai et al. developed real-time auditory and visual multiple-talker tracking technology that associates auditory and visual streams. The system is implemented on an upper-torso humanoid, and real-time talker tracking is attained with a delay of 200 msec by distributed processing on four PCs connected by Gigabit Ethernet. Focus-of-attention is programmable and allows a variety of behaviors. The system demonstrates non-verbal social interaction: a receptionist robot focuses on an associated stream, while a companion robot focuses on an auditory stream.
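The programmable focus-of-attention described above can be pictured as a priority over stream types: the receptionist behavior prefers an associated (audio-visual) stream, while the companion behavior prefers an auditory stream. The sketch below is purely illustrative; the `Stream` class, the `PRIORITIES` tables, and the selection rule are hypothetical simplifications, not the authors' implementation.

```python
from dataclasses import dataclass

@dataclass
class Stream:
    """A perceptual stream; 'associated' means auditory and visual streams were matched."""
    kind: str          # "associated", "auditory", or "visual"
    azimuth: float     # direction of the talker, in degrees (hypothetical attribute)

# Hypothetical priority tables for the two behaviors named in the abstract:
# a receptionist attends to associated streams first, a companion to auditory ones.
PRIORITIES = {
    "receptionist": {"associated": 0, "auditory": 1, "visual": 2},
    "companion":    {"auditory": 0, "associated": 1, "visual": 2},
}

def focus_of_attention(streams, role):
    """Select the stream to attend to under the given role's priority table."""
    if not streams:
        return None
    return min(streams, key=lambda s: PRIORITIES[role][s.kind])

streams = [Stream("visual", 30.0), Stream("auditory", -45.0), Stream("associated", 10.0)]
print(focus_of_attention(streams, "receptionist").kind)  # associated
print(focus_of_attention(streams, "companion").kind)     # auditory
```

Swapping the priority table is what makes the same tracking machinery support different social behaviors; the robot would then orient toward the azimuth of the selected stream.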
References
Breazeal, C., and Scassellati, B. A context-dependent attention system for a social robot. Proceedings of the Sixteenth International Joint Conf. on Artificial Intelligence (IJCAI-99), 1146–1151.
Breazeal, C. Emotive qualities in robot speech. Proc. of IEEE/RSJ International Conf. on Intelligent Robots and Systems (IROS-2001), 1389–1394.
Brooks, R. A., Breazeal, C., Irie, R., Kemp, C. C., Marjanovic, M., Scassellati, B., and Williamson, M. M. Alternative essences of intelligence. Proc. of 15th National Conf. on Artificial Intelligence (AAAI-98), 961–968.
Horvitz, E., and Paek, T. A computational architecture for conversation. Proc. of Seventh International Conf. on User Modeling (1999), Springer, 201–210.
Kagami, S., Okada, K., Inaba, M., and Inoue, H. Real-time 3D optical flow generation system. Proc. of International Conf. on Multisensor Fusion and Integration for Intelligent Systems (MFI'99), 237–242.
Kawahara, T., Lee, A., Kobayashi, T., Takeda, K., Minematsu, N., Itou, K., Ito, A., Yamamoto, M., Yamada, A., Utsuro, T., and Shikano, K. Japanese dictation toolkit (1997 version). Journal of the Acoustical Society of Japan (E) 20, 3 (1999), 233–239.
Matsusaka, Y., Tojo, T., Kubota, S., Furukawa, K., Tamiya, D., Hayata, K., Nakano, Y., and Kobayashi, T. Multi-person conversation via multi-modal interface — a robot who communicates with multi-user. Proc. of 6th European Conf. on Speech Communication and Technology (EUROSPEECH-99), ESCA, 1723–1726.
Nakadai, K., Lourens, T., Okuno, H. G., and Kitano, H. Active audition for humanoid. Proc. of 17th National Conf. on Artificial Intelligence (AAAI-2000), 832–839.
Nakadai, K., Matsui, T., Okuno, H. G., and Kitano, H. Active audition system and humanoid exterior design. Proc. of IEEE/RSJ International Conf. on Intelligent Robots and Systems (IROS-2000), 1453–1461.
Nakadai, K., Hidai, K., Mizoguchi, H., Okuno, H. G., and Kitano, H. Real-time auditory and visual multiple-object tracking for robots. Proc. of the Seventeenth International Joint Conf. on Artificial Intelligence (IJCAI-01), 1425–1432.
Okuno, H. G., Nakadai, K., Lourens, T., and Kitano, H. Sound and visual tracking for humanoid robot. Proc. of Fourteenth International Conf. on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems (IEA/AIE-2001) (Jun. 2001), LNAI 2070, Springer-Verlag, 640–650.
Okuno, H. G., Nakatani, T., and Kawabata, T. Listening to two simultaneous speeches. Speech Communication 27, 3–4 (1999), 281–298.
Ono, T., Imai, M., and Ishiguro, H. A model of embodied communications with gestures between humans and robots. Proc. of Twenty-third Annual Meeting of the Cognitive Science Society (CogSci2001), AAAI, 732–737.
Waldherr, S., Thrun, S., Romero, R., and Margaritis, D. Template-based recognition of pose and motion gestures on a mobile robot. Proc. of 15th National Conf. on Artificial Intelligence (AAAI-98), 977–982.
Wolfe, J., Cave, K. R., and Franzel, S. Guided search: An alternative to the feature integration model for visual search. Journal of Experimental Psychology: Human Perception and Performance 15, 3 (1989), 419–433.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Okuno, H.G., Nakadai, K., Kitano, H. (2002). Social Interaction of Humanoid Robot Based on Audio-Visual Tracking. In: Hendtlass, T., Ali, M. (eds) Developments in Applied Artificial Intelligence. IEA/AIE 2002. Lecture Notes in Computer Science(), vol 2358. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48035-8_70
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43781-9
Online ISBN: 978-3-540-48035-8