
Action learning and grounding in simulated human–robot interactions

Published online by Cambridge University Press:  12 November 2019

Oliver Roesler*
Affiliation:
Artificial Intelligence Lab, Vrije Universiteit Brussel, Pleinlaan 9, 1050 Brussels, Belgium; e-mails: oliver@roesler.co.uk, ann.nowe@vub.ac.be
Ann Nowé
Affiliation:
Artificial Intelligence Lab, Vrije Universiteit Brussel, Pleinlaan 9, 1050 Brussels, Belgium; e-mails: oliver@roesler.co.uk, ann.nowe@vub.ac.be

Abstract

In order to enable robots to interact with humans in a natural way, they need to be able to autonomously learn new tasks. The most natural way for humans to instruct another agent, whether a human or a robot, to perform a task is via natural language. Thus, natural human–robot interactions also require robots to understand natural language, i.e., to extract the meaning of words and phrases. To do this, words and phrases need to be linked to their corresponding percepts through grounding. Afterward, agents can learn the optimal micro-action patterns to reach the goal states of the desired tasks. Most previous studies investigated either learning of actions or grounding of words, but not both. Additionally, they often used only a small set of tasks as well as very short and unnaturally simplified utterances. In this paper, we introduce a framework that uses reinforcement learning to learn actions for several tasks and cross-situational learning to ground actions, object shapes and colors, and prepositions. The proposed framework is evaluated through a simulated interaction experiment between a human tutor and a robot. The results show that the employed framework can be used for both action learning and grounding.
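To illustrate the cross-situational learning idea the abstract refers to, the sketch below shows a minimal co-occurrence counter that grounds words in percepts across repeated situations. This is a hypothetical toy model, not the authors' actual framework; all class, word, and percept names here are invented for illustration.

```python
from collections import defaultdict

class CrossSituationalLearner:
    """Counts word-percept co-occurrences across situations and grounds
    each word to the percept it has co-occurred with most often."""

    def __init__(self):
        # counts[word][percept] = number of shared situations
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, words, percepts):
        # Within one situation, every word of the utterance co-occurs
        # with every percept; referential ambiguity resolves over many
        # situations because only correct pairs co-occur consistently.
        for w in words:
            for p in percepts:
                self.counts[w][p] += 1

    def ground(self, word):
        # Map a word to its most frequently co-occurring percept.
        percepts = self.counts[word]
        return max(percepts, key=percepts.get) if percepts else None

learner = CrossSituationalLearner()
# Hypothetical tutor utterances paired with perceived scene features.
learner.observe(["pick", "red", "cube"], ["GRASP", "COLOR_RED", "SHAPE_CUBE"])
learner.observe(["pick", "blue", "ball"], ["GRASP", "COLOR_BLUE", "SHAPE_SPHERE"])
learner.observe(["push", "red", "ball"], ["PUSH", "COLOR_RED", "SHAPE_SPHERE"])

print(learner.ground("red"))   # COLOR_RED
print(learner.ground("pick"))  # GRASP
```

After three situations, "red" has co-occurred with COLOR_RED twice but with every other percept at most once, so the ambiguity present in any single situation disappears; the paper's framework additionally grounds prepositions and couples this with reinforcement learning, which this sketch does not attempt.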

Type: Adaptive and Learning Agents
Copyright: © Cambridge University Press, 2019
