Emergence of Safe Behaviours with an Intrinsic Reward

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6943)

Abstract

This paper explores the idea that a robot can learn safe behaviors without prior knowledge of its environment or the task at hand, using an intrinsic motivation to reverse its actions. Our general idea is that if the robot learns to reverse its actions, all the behaviors that emerge from this principle are intrinsically safe. We validate this idea with experiments that benchmark the performance of obstacle avoidance behavior. We compare our algorithm, which is based on an abstract intrinsic reward, with a Q-learning algorithm for obstacle avoidance that is based on an external reward signal. Finally, we demonstrate that the safety of learning can be increased further by first training the robot in a simulator using the intrinsic reward and then running the test with the real robot in the real environment.

The experimental results show that the performance of the proposed algorithm is on average only 5-10% lower than that of the Q-learning algorithm. A physical robot using the knowledge obtained in simulation performs 10% worse in the real world than in simulation. However, after a short learning period its performance reaches the same success rate as that of the physically trained robot. We interpret this as evidence confirming the hypothesis that our learning algorithm can be used to teach safe behaviors to a robot.
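
The reversibility principle described above can be illustrated with a toy sketch. This is not the authors' algorithm, whose details are not given on this page. The one-dimensional corridor, the names `step`, `intrinsic_reward` and `learn`, the +1/-1 reward values and the learning parameters are all assumptions made for illustration. The update is a simple one-step value estimate driven only by the intrinsic reward, and is simpler than the Q-learning baseline the paper compares against.

```python
import random

# Hypothetical 1-D corridor: cells 0..SIZE-1, with walls beyond both ends.
SIZE = 6
ACTIONS = (-1, +1)            # step left / step right
REVERSE = {-1: +1, +1: -1}    # assumed known reverse of each action


def step(pos, action):
    """Move one cell; bumping into a wall leaves the robot where it was."""
    return min(max(pos + action, 0), SIZE - 1)


def intrinsic_reward(pos, action):
    """Return +1 if `action` followed by its reverse leads back to `pos`, else -1."""
    after = step(pos, action)
    back = step(after, REVERSE[action])
    return 1.0 if back == pos else -1.0


def learn(episodes=2000, alpha=0.2, epsilon=0.2, seed=0):
    """Estimate a value per (state, action) from the intrinsic reward only."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(SIZE) for a in ACTIONS}
    for _ in range(episodes):
        pos = rng.randrange(SIZE)
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)                       # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(pos, a)])   # exploit
        reward = intrinsic_reward(pos, action)
        # One-step update toward the observed reversibility reward.
        q[(pos, action)] += alpha * (reward - q[(pos, action)])
    return q


if __name__ == "__main__":
    q = learn()
    for s in range(SIZE):
        best = max(ACTIONS, key=lambda a: q[(s, a)])
        print(f"cell {s}: preferred action {best:+d}")
```

In this sketch, driving into a wall cannot be undone by the reverse action, so it receives a negative reward. The learned values therefore steer the robot away from the walls even though no external obstacle-avoidance reward was ever given.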

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gavshin, Y., Kruusmaa, M. (2011). Emergence of Safe Behaviours with an Intrinsic Reward. In: Bouchachia, A. (ed.) Adaptive and Intelligent Systems. ICAIS 2011. Lecture Notes in Computer Science, vol 6943. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23857-4_20

  • DOI: https://doi.org/10.1007/978-3-642-23857-4_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23856-7

  • Online ISBN: 978-3-642-23857-4

  • eBook Packages: Computer Science, Computer Science (R0)