Reinforcement-Driven Shaping of Sequence Learning in Neural Dynamics

Luciw, Matthew; Kazerounian, Sohrob; Sandamirskaya, Yulia; Schöner, Gregor; Schmidhuber, Jürgen

doi:10.1007/978-3-319-08864-8_19

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8575)

Included in the following conference series: International Conference on Simulation of Adaptive Behavior

Abstract

We present a simulated model of a mobile KUKA youBot that uses Dynamic Field Theory for its underlying perceptual and motor control systems while learning behavioral sequences through reinforcement learning. Although dynamic neural fields have previously been used for robust control in robotics, high-level behavior has generally been pre-programmed by hand. In the present work, we extend a recent framework for integrating reinforcement learning and dynamic neural fields by applying the principle of shaping to reduce the search space of the learning agent.

Author information

Authors and Affiliations

Istituto Dalle Molle di Studi sull’Intelligenza Artificiale (IDSIA), Manno-Lugano, Switzerland
Matthew Luciw, Sohrob Kazerounian & Jürgen Schmidhuber
Institut für Neuroinformatik at the Universitätstr, Bochum, Germany
Yulia Sandamirskaya & Gregor Schöner

Editor information

Editors and Affiliations

Robotic Intelligence Laboratory, Jaume I University, Avda. Sos Baynat s/n, 12071, Castellón de la Plana, Spain
Angel P. del Pobil
School of Computing, University of Leeds, LS2 9JT, Leeds, UK
Eris Chinellato
Robotic Intelligence Laboratory, Jaume I University, Avda. Sos Baynat s/n, 12071, Castellón de la Plana, Spain
Ester Martinez-Martin & Enric Cervera
Mærsk McKinney Møller Institute, University of Southern Denmark, Campusvej 55, 5230, Odense, Denmark
John Hallam
Robotic Intelligence Laboratory, Jaume I University, Avda. Sos Baynat s/n, 12071, Castellón de la Plana, Spain
Antonio Morales

Cite this paper

Luciw, M., Kazerounian, S., Sandamirskaya, Y., Schöner, G., Schmidhuber, J. (2014). Reinforcement-Driven Shaping of Sequence Learning in Neural Dynamics. In: del Pobil, A.P., Chinellato, E., Martinez-Martin, E., Hallam, J., Cervera, E., Morales, A. (eds) From Animals to Animats 13. SAB 2014. Lecture Notes in Computer Science, vol 8575. Springer, Cham. https://doi.org/10.1007/978-3-319-08864-8_19

DOI: https://doi.org/10.1007/978-3-319-08864-8_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08863-1
Online ISBN: 978-3-319-08864-8
eBook Packages: Computer Science, Computer Science (R0)
