Reinforcement-Driven Shaping of Sequence Learning in Neural Dynamics

  • Conference paper
From Animals to Animats 13 (SAB 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8575)

Abstract

We present a simulated model of a mobile Kuka Youbot that uses Dynamic Field Theory for its underlying perceptual and motor control systems and learns behavioral sequences through reinforcement learning. Although dynamic neural fields have previously been used for robust control in robotics, high-level behavior has generally been programmed by hand. In the present work, we extend a recent framework for integrating reinforcement learning and dynamic neural fields by applying the principle of shaping to reduce the search space of the learning agent.
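To make the search-space argument concrete, the following is a minimal toy sketch of shaping, not the paper's method: it involves no dynamic neural fields and no robot simulation, and the behavior names, the target sequence, and the functions `run_episode` and `learn_with_shaping` are all hypothetical. It assumes a trainer who rewards progressively longer prefixes of a target sequence, so the learner only has to search over the newest step at each stage.

```python
# Toy illustration of shaping (assumption: NOT the architecture from the paper).
# Behavior names and the target sequence are hypothetical.
BEHAVIORS = ["search", "approach", "grasp", "transport"]
TARGET = ["search", "approach", "grasp", "transport"]


def run_episode(q, length):
    """Greedily pick one behavior per step; ties go to the first behavior listed."""
    return [max(BEHAVIORS, key=lambda b: q[(step, b)]) for step in range(length)]


def learn_with_shaping(target, alpha=1.0):
    """Learn `target` stage by stage, rewarding ever longer correct prefixes.

    Values start optimistically at 1.0, so greedy selection tries each
    untested behavior once before settling on the rewarded one.
    """
    q = {(s, b): 1.0 for s in range(len(target)) for b in BEHAVIORS}
    episodes = 0
    for stage in range(1, len(target) + 1):
        while True:
            executed = run_episode(q, stage)
            episodes += 1
            # Shaped reward: the first `stage` behaviors must match the target.
            reward = 1.0 if executed == target[:stage] else 0.0
            # Only the newest step is still being explored at this stage.
            key = (stage - 1, executed[stage - 1])
            q[key] += alpha * (reward - q[key])
            if reward == 1.0:
                break
    return q, episodes


if __name__ == "__main__":
    q, episodes = learn_with_shaping(TARGET)
    print(run_episode(q, len(TARGET)), episodes)
```

In this sketch each stage needs at most `len(BEHAVIORS)` episodes, so the total is bounded by 4 × 4 = 16 episodes, whereas rewarding only the complete sequence would leave up to 4^4 = 256 candidate sequences to search.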

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Luciw, M., Kazerounian, S., Sandamirskaya, Y., Schöner, G., Schmidhuber, J. (2014). Reinforcement-Driven Shaping of Sequence Learning in Neural Dynamics. In: del Pobil, A.P., Chinellato, E., Martinez-Martin, E., Hallam, J., Cervera, E., Morales, A. (eds) From Animals to Animats 13. SAB 2014. Lecture Notes in Computer Science, vol 8575. Springer, Cham. https://doi.org/10.1007/978-3-319-08864-8_19

  • DOI: https://doi.org/10.1007/978-3-319-08864-8_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08863-1

  • Online ISBN: 978-3-319-08864-8

  • eBook Packages: Computer Science (R0)
