Modelling Coordination of Learning Systems: A Reservoir Systems Approach to Dopamine Modulated Pavlovian Conditioning

  • Conference paper
Advances in Artificial Life. Darwin Meets von Neumann (ECAL 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5777)

Abstract

This paper presents a biologically constrained reward prediction model that learns cue-outcome associations between temporally distant stimuli without relying on the widely used temporal difference model. The model makes novel use of an adapted echo state network in place of the biologically implausible delay chains that dopamine models usually employ to handle temporally structured stimuli. It is based on a novel algorithm that successfully coordinates two subsystems: one providing Pavlovian conditioning, the other providing timely inhibition of dopamine responses to salient anticipated stimuli. The model is validated against the typical profile of phasic dopamine in first- and second-order Pavlovian conditioning. It is relevant both to explaining the mechanisms underlying the biological regulation of dopamine signals and to applications in autonomous robotics that involve reinforcement-based learning.
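
As a rough illustration of why a reservoir can stand in for a delay chain, the sketch below drives a small echo-state-style network with a single brief cue and records how long a trace of that cue persists in the recurrent state. It is not the adapted network from the paper: the function names, network size, density, and seed are all hypothetical, no readout is trained, and the recurrent weights are scaled by their maximum absolute row sum (a simple sufficient condition for fading memory) rather than by the spectral radius that echo state networks conventionally use.

```python
import math
import random


def make_reservoir(n_units, norm=0.9, density=0.2, seed=0):
    """Build sparse random recurrent weights and input weights (illustrative values)."""
    rng = random.Random(seed)
    w = [[rng.uniform(-1.0, 1.0) if rng.random() < density else 0.0
          for _ in range(n_units)]
         for _ in range(n_units)]
    # Assumption: scale so the largest absolute row sum equals `norm` (< 1).
    # With tanh units this makes the state update a contraction in the max norm.
    max_row = max(sum(abs(v) for v in row) for row in w)
    if max_row > 0:
        w = [[v * norm / max_row for v in row] for row in w]
    w_in = [rng.uniform(-1.0, 1.0) for _ in range(n_units)]
    return w, w_in


def step(w, w_in, state, u):
    """One reservoir update: x(t+1) = tanh(W x(t) + w_in * u(t))."""
    return [math.tanh(sum(wij * sj for wij, sj in zip(row, state)) + wi * u)
            for row, wi in zip(w, w_in)]


def run(w, w_in, inputs):
    """Feed a scalar input sequence and return the state after every step."""
    state = [0.0] * len(w)
    history = []
    for u in inputs:
        state = step(w, w_in, state, u)
        history.append(state)
    return history


w, w_in = make_reservoir(n_units=30)
cue = [1.0] + [0.0] * 19          # one-step cue followed by silence
history = run(w, w_in, cue)

# Largest unit activation at each step: a crude measure of the remaining cue trace.
trace = [max(abs(x) for x in s) for s in history]
```

The `trace` list shrinks step by step after the cue ends but does not vanish immediately, so a learned readout could in principle use the reservoir state to tell how long ago the cue occurred. That is the role a delay chain would otherwise play.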


Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lowe, R., Mannella, F., Ziemke, T., Baldassarre, G. (2011). Modelling Coordination of Learning Systems: A Reservoir Systems Approach to Dopamine Modulated Pavlovian Conditioning. In: Kampis, G., Karsai, I., Szathmáry, E. (eds) Advances in Artificial Life. Darwin Meets von Neumann. ECAL 2009. Lecture Notes in Computer Science, vol 5777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21283-3_51

  • DOI: https://doi.org/10.1007/978-3-642-21283-3_51

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21282-6

  • Online ISBN: 978-3-642-21283-3

  • eBook Packages: Computer Science, Computer Science (R0)