A Multi-modal Approach for Natural Human-Robot Interaction

Kollar, Thomas; Vedantham, Anu; Sobel, Corey; Chang, Cory; Perera, Vittorio; Veloso, Manuela

doi:10.1007/978-3-642-34103-8_46


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7621)

Included in the following conference series:

International Conference on Social Robotics

Abstract

We present a robot that is able to interact with people in a natural, multi-modal way by using both speech and gesture. The robot is able to track people, process speech and understand language. To track people and recognize gestures, the robot uses an RGB-D sensor (e.g., a Microsoft Kinect). To recognize speech, the robot uses a cloud-based service. To understand language, the robot uses a probabilistic graphical model to infer the meaning of a natural language query. We have evaluated our system in two domains. The first domain is a robot receptionist (roboceptionist); we show that the roboceptionist is able to interact successfully with people 77% of the time when people are primed with the capabilities of the robot compared to 57% when people are not primed with its capabilities. The second domain is a mobile service robot, which is able to interact with people via natural language.
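The abstract does not say which probabilistic model is used to infer the meaning of a query. As a rough illustration of the general idea only, the sketch below assumes a naive Bayes classifier, one of the simplest probabilistic graphical models, that maps a spoken query to an intent. The intent labels, training queries, and function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training queries and intent labels (not from the paper).
EXAMPLES = [
    ("where is the office of professor smith", "find_person"),
    ("who is professor jones", "find_person"),
    ("where can i find the kitchen", "find_place"),
    ("where is the elevator", "find_place"),
    ("take me to the lab", "go_to"),
    ("go to the kitchen", "go_to"),
]

def train(examples, alpha=1.0):
    """Count label and per-label word frequencies; alpha is the Laplace smoothing term."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab, alpha

def posterior(model, text):
    """Return P(label | text) under the naive Bayes independence assumption."""
    label_counts, word_counts, vocab, alpha = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, n in label_counts.items():
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        score = math.log(n / total)  # log prior
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + alpha) / denom)
        log_scores[label] = score
    # Normalize in log space for numerical stability.
    peak = max(log_scores.values())
    unnorm = {k: math.exp(v - peak) for k, v in log_scores.items()}
    z = sum(unnorm.values())
    return {k: v / z for k, v in unnorm.items()}

if __name__ == "__main__":
    model = train(EXAMPLES)
    probs = posterior(model, "go to the lab")
    print(max(probs, key=probs.get), probs)
```

A robot could act on the most probable intent when its posterior is high and ask a clarifying question otherwise; that thresholding policy is likewise an assumption, not something the abstract describes.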

Author information

Authors and Affiliations

Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA, 15213, USA
Thomas Kollar, Anu Vedantham, Corey Sobel, Cory Chang, Vittorio Perera & Manuela Veloso

Editor information

Editors and Affiliations

Interactive Digital Media Institute, Social Robotics Laboratory, National University of Singapore, 4 Engineering Drive 3, 117576, Singapore, Singapore
Shuzhi Sam Ge & John-John Cabibihan
Artificial Intelligence Laboratory, Department of Computer Science, Stanford University, Gates Building, 94305-9010, Stanford, CA, USA
Oussama Khatib
Robotics Institute, School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, 15213, Pittsburgh, PA, USA
Reid Simmons
Faculty of Information Technology, University of Technology, Human-robot Collaboration Studio, 2007, Sydney, NSW, Australia
Mary-Anne Williams

Cite this paper

Kollar, T., Vedantham, A., Sobel, C., Chang, C., Perera, V., Veloso, M. (2012). A Multi-modal Approach for Natural Human-Robot Interaction. In: Ge, S.S., Khatib, O., Cabibihan, JJ., Simmons, R., Williams, MA. (eds) Social Robotics. ICSR 2012. Lecture Notes in Computer Science, vol 7621. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34103-8_46

DOI: https://doi.org/10.1007/978-3-642-34103-8_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34102-1
Online ISBN: 978-3-642-34103-8
eBook Packages: Computer Science, Computer Science (R0)
