DOI: 10.1145/3171221.3171289
research-article
Public Access

Deep Reinforcement Learning of Abstract Reasoning from Demonstrations

Published: 26 February 2018

Abstract

Extracting a set of generalizable rules that govern the dynamics of complex, high-level interactions between humans, based only on observations, is a high-level cognitive ability. Mastery of this skill marks a significant milestone in the human developmental process. A key challenge in designing such an ability in autonomous robots is discovering the relationships among discriminatory features. Identifying features in natural scenes that are representative of a particular event or interaction (i.e., "discriminatory features") and then discovering the relationships (e.g., temporal, spatial, spatio-temporal, or causal) among those features in the form of generalized rules are non-trivial problems. They often appear as a "chicken-and-egg" dilemma. This paper proposes an end-to-end learning framework to tackle these two problems in the context of learning generalized, high-level rules of human interactions from structured demonstrations. We employed our proposed deep reinforcement learning framework to learn a set of rules that govern a behavioral intervention session between two agents based on observations of several instances of the session. We also tested the accuracy of our framework with human subjects in diverse situations.

Published In

HRI '18: Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction
February 2018
468 pages
ISBN: 9781450349536
DOI: 10.1145/3171221
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. abstract reasoning
  2. deep learning
  3. learning from demonstration

Qualifiers

  • Research-article

Conference

HRI '18
Acceptance Rates

HRI '18 Paper Acceptance Rate 49 of 206 submissions, 24%;
Overall Acceptance Rate 268 of 1,124 submissions, 24%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 140
  • Downloads (last 6 weeks): 12
Reflects downloads up to 03 Mar 2025

Citations

Cited By

  • (2023) A Penetration Method for UAV Based on Distributed Reinforcement Learning and Demonstrations. Drones 7:4 (232). DOI: 10.3390/drones7040232. Online publication date: 27-Mar-2023
  • (2022) Deep Q-network for social robotics using emotional social signals. Frontiers in Robotics and AI 9. DOI: 10.3389/frobt.2022.880547. Online publication date: 26-Sep-2022
  • (2022) Learning Turn-Taking Behavior from Human Demonstrations for Social Human-Robot Interactions. 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 7643-7649. DOI: 10.1109/IROS47612.2022.9981243. Online publication date: 23-Oct-2022
  • (2022) A survey on deep reinforcement learning for audio-based applications. Artificial Intelligence Review 56:3 (2193-2240). DOI: 10.1007/s10462-022-10224-2. Online publication date: 2-Jul-2022
  • (2021) In-the-Wild Learning from Demonstration for Therapies for Autism Spectrum Disorder. 2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN), 1224-1229. DOI: 10.1109/RO-MAN50785.2021.9515439. Online publication date: 8-Aug-2021
  • (2019) Leveraging Temporal Reasoning for Policy Selection in Learning from Demonstration. 2019 International Conference on Robotics and Automation (ICRA), 7798-7804. DOI: 10.1109/ICRA.2019.8794461. Online publication date: May-2019
