Abstract
We present a novel method for a robot to interactively learn a joint human–robot task while executing it. We consider collaborative tasks carried out by a team of a human operator and a robot helper that adapts to the human's task-execution preferences. Different human operators can have different abilities, experience, and personal preferences, so that one allocation of activities within the team is preferred over another. Our main goal is to have the robot learn the task and the preferences of the user to enable a more efficient and acceptable joint task execution. We cast concurrent multi-agent collaboration as a semi-Markov decision process and show how to model the team behavior and learn the expected robot behavior. We further propose an interactive learning framework, which we evaluate both in simulation and on a real robotic setup to show that the system can effectively learn and adapt to human expectations.
Notes
We sometimes use the term policy to refer to a deterministic mapping from S to A.
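To make the note concrete, a deterministic policy in this sense can be sketched as a simple lookup from states to actions. The state and action names below are purely illustrative and not taken from the paper:

```python
# Illustrative sketch (not from the paper): a deterministic policy
# as a mapping from states S to actions A. State and action names
# here are hypothetical placeholders for a collaborative task.
policy = {
    "part_on_table": "pick_part",
    "part_in_gripper": "hand_over",
    "human_waiting": "hand_over",
    "idle": "wait",
}

def act(state):
    """Return the single action the deterministic policy assigns to `state`."""
    return policy[state]
```

Because the mapping is deterministic, each state yields exactly one action, e.g. `act("part_on_table")` returns `"pick_part"`.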
Acknowledgements
This work was supported by national funds through Fundação para a Ciência e a Tecnologia (FCT) with reference UID/CEC/50021/2013 and by the EU FP7-ICT project 3rdHand under Grant Agreement No. 610878.
Additional information
This is one of several papers published in Autonomous Robots comprising the Special Issue on Learning for Human-Robot Collaboration.
Cite this article
Munzer, T., Toussaint, M. & Lopes, M. Efficient behavior learning in human–robot collaboration. Auton Robot 42, 1103–1115 (2018). https://doi.org/10.1007/s10514-017-9674-5