Emergence of Safe Behaviours with an Intrinsic Reward

Gavshin, Yuri; Kruusmaa, Maarja

doi:10.1007/978-3-642-23857-4_20

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6943)

Abstract

This paper explores the idea that a robot can learn safe behaviors without prior knowledge of its environment or the task at hand, using an intrinsic motivation to reverse its actions. The general idea is that if the robot learns to reverse its actions, all behaviors that emerge from this principle are intrinsically safe. We validate this idea with experiments that benchmark the performance of obstacle avoidance behavior. We compare our algorithm, which is based on an abstract intrinsic reward, with a Q-learning algorithm for obstacle avoidance that is based on an external reward signal. Finally, we demonstrate that the safety of learning can be increased further by first training the robot in a simulator using the intrinsic reward and then running the test with the real robot in the real environment.

The experimental results show that the performance of the proposed algorithm is on average only 5-10% lower than that of the Q-learning algorithm. A physical robot using the knowledge obtained in simulation performs 10% worse in the real world than in simulation. However, after a short learning period its performance reaches the same success rate as that of the physically trained robot. We interpret this as evidence confirming the hypothesis that our learning algorithm can be used to teach safe behaviors to a robot.
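The reversibility idea in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration: the one-dimensional corridor, the action names, the `REVERSE` pairing, the reward definition in `intrinsic_reward`, and the learning parameters are hypothetical and are not taken from the paper. The sketch feeds the intrinsic signal into a plain tabular Q-learning update only to keep the example short; the abstract does not describe the authors' own learning algorithm, and in the paper Q-learning is the baseline that uses an external reward.

```python
from collections import defaultdict

# Hypothetical 1-D corridor: cells 0..4, with walls beyond both ends.
N_CELLS = 5
ACTIONS = {"forward": +1, "backward": -1}
# Assumed pairing of each action with its reverse.
REVERSE = {"forward": "backward", "backward": "forward"}


def step(state, action):
    """Move one cell; bumping into a wall leaves the robot in place."""
    return min(max(state + ACTIONS[action], 0), N_CELLS - 1)


def intrinsic_reward(state, action):
    """Return 0 if the reverse action restores the start state, else a penalty.

    No obstacle or task knowledge is used: only the mismatch between the
    start state and the state reached after action + reverse action.
    """
    after = step(state, action)
    restored = step(after, REVERSE[action])
    return -abs(restored - state)


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


q = defaultdict(float)
for _ in range(50):  # deterministic sweeps over every state-action pair
    for s in range(N_CELLS):
        for a in ACTIONS:
            q_update(q, s, a, intrinsic_reward(s, a), step(s, a))

# Greedy action per cell after learning.
greedy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_CELLS)}
```

In this toy setting, driving into a wall is the only irreversible action, because the reverse move does not bring the robot back to where it started. The learned values therefore steer the robot away from both walls, even though no collision signal was ever supplied.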

Author information

Authors and Affiliations

Centre for Biorobotics, Tallinn University of Technology, Akadeemia tee 15a-111, 12618, Tallinn, Estonia
Yuri Gavshin & Maarja Kruusmaa

Editor information

Editors and Affiliations

Department of Informatics-Systems, University of Klagenfurt, Universitätsstr. 65-67, 9020, Klagenfurt, Austria
Abdelhamid Bouchachia

About this paper

Cite this paper

Gavshin, Y., Kruusmaa, M. (2011). Emergence of Safe Behaviours with an Intrinsic Reward. In: Bouchachia, A. (ed.) Adaptive and Intelligent Systems. ICAIS 2011. Lecture Notes in Computer Science, vol. 6943. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23857-4_20

DOI: https://doi.org/10.1007/978-3-642-23857-4_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23856-7
Online ISBN: 978-3-642-23857-4
eBook Packages: Computer Science, Computer Science (R0)
