Multimodal Representation Learning for Human Robot Interaction

Authors:

Eli Sheppard,

Katrin S. Lohan

HRI '20: Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction

Pages 445 - 446

https://doi.org/10.1145/3371382.3378265

Published: 01 April 2020

Abstract

We present a neural-network-based system that learns a multimodal representation of images and words. This representation allows bidirectional grounding between the meaning of words and the visual attributes they represent, such as colour, size and object name. We also present a new dataset captured specifically for this task.

Index Terms

Multimodal Representation Learning for Human Robot Interaction
1. Computing methodologies
  1. Artificial intelligence
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Published In

HRI '20: Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction

March 2020

702 pages

ISBN:9781450370578

DOI:10.1145/3371382

General Chairs: Tony Belpaeme (Ghent University, Belgium), James Young (University of Manitoba, Canada)

Program Chairs: Hatice Gunes (University of Cambridge, UK), Laurel Riek (UC San Diego, USA)

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Abstract

Funding Sources

Engineering and Physical Sciences Research Council

Conference

HRI '20: ACM/IEEE International Conference on Human-Robot Interaction

March 23 - 26, 2020

Cambridge, United Kingdom

Acceptance Rates

Overall Acceptance Rate 192 of 519 submissions, 37%
