Learning from Delayed Reward and Punishment in a Spiking Neural Network Model of Basal Ganglia with Opposing D1/D2 Plasticity

Jitsev, Jenia; Abraham, Nobi; Morrison, Abigail; Tittgemeyer, Marc

doi:10.1007/978-3-642-33269-2_58

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7552)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

Extending previous work, we introduce a spiking actor-critic network model of learning from reward and punishment in the basal ganglia. In the model, the striatum is segregated into populations of medium spiny neurons (MSNs) that express either the D1 or the D2 dopamine receptor type. This segregation allows both positive and negative expected outcomes to be represented explicitly within the respective populations. In line with recent experiments, we further assume that the D1 and D2 MSN populations have opposing dopamine-modulated bidirectional synaptic plasticity. Experiments were conducted in a grid world, where a moving agent had to reach a remote rewarded goal state. Unlike the previous model, the network learned not only to approach the rewarded goal but also to consistently avoid punishments. The spiking network model explains the functional role of D1/D2 MSN segregation within the striatum, specifically the reversed direction of dopamine-dependent plasticity found at synapses converging on different MSNs.
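
The idea of opposing D1/D2 plasticity can be illustrated with a deliberately simplified, non-spiking sketch. This is not the authors' network: it assumes a tabular temporal-difference critic on a three-state chain instead of a grid world, and it assumes that a positive dopamine signal strengthens the D1 weight and weakens the D2 weight, with the reverse for a negative signal. The names `opposing_update` and `learn_values`, the learning rate, and the discount factor are all hypothetical.

```python
# Toy, rate-based illustration only; every name and constant here is an assumption.

def opposing_update(w_d1, w_d2, delta, lr=0.1):
    """Apply one dopamine-modulated update with opposite signs for D1 and D2.

    delta > 0 (assumed dopamine burst) raises the D1 weight and lowers the D2 weight;
    delta < 0 (assumed dopamine dip) does the reverse. Weights are clipped at zero
    because they stand in for non-negative synaptic efficacies.
    """
    new_d1 = max(0.0, w_d1 + lr * delta)
    new_d2 = max(0.0, w_d2 - lr * delta)
    return new_d1, new_d2


def learn_values(outcomes, episodes=200, gamma=0.9, lr=0.1):
    """Run TD(0) on a deterministic chain where state i always moves to i + 1.

    outcomes[i] is the reward (positive) or punishment (negative) received on
    leaving state i; the episode ends after the last state.
    Returns per-state D1 and D2 weights; the net expected outcome is d1 - d2.
    """
    n = len(outcomes)
    d1 = [0.0] * n
    d2 = [0.0] * n
    for _ in range(episodes):
        for s in range(n):
            v_next = d1[s + 1] - d2[s + 1] if s + 1 < n else 0.0
            # Temporal-difference error, standing in for the dopamine signal.
            delta = outcomes[s] + gamma * v_next - (d1[s] - d2[s])
            d1[s], d2[s] = opposing_update(d1[s], d2[s], delta, lr)
    return d1, d2


if __name__ == "__main__":
    print(learn_values([0.0, 0.0, 1.0]))   # rewarded chain: value accumulates in D1
    print(learn_values([0.0, 0.0, -1.0]))  # punished chain: value accumulates in D2
```

Under these assumptions, a chain that ends in reward builds up only D1 weights and a chain that ends in punishment builds up only D2 weights, so each population carries one sign of expected outcome, which is the role the abstract assigns to the D1/D2 segregation.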

Author information

Authors and Affiliations

Cortical Networks and Cognitive Functions Group, Max-Planck-Institute for Neurological Research, 50931, Cologne, Germany
Jenia Jitsev, Nobi Abraham & Marc Tittgemeyer
Functional Neural Circuits Group, Bernstein Center Freiburg, Albert-Ludwig University of Freiburg, 79104, Freiburg, Germany
Abigail Morrison

Editor information

Editors and Affiliations

Neuro Heuristic Research Group, University of Lausanne, 1015, Lausanne, Switzerland
Alessandro E. P. Villa
Department of Informatics, Nicolaus Copernicus University, 87-100, Toruń, Poland
Włodzisław Duch
Center for Complex Systems Studies, Kalamazoo College, 49006, Kalamazoo, MI, USA
Péter Érdi
Dipartimento di Informatica e Scienze dell’Informazione, Università di Genova, 16146, Genoa, Italy
Francesco Masulli
Institut für Neuroinformatik, Universität Ulm, 89069, Ulm, Germany
Günther Palm

About this paper

Cite this paper

Jitsev, J., Abraham, N., Morrison, A., Tittgemeyer, M. (2012). Learning from Delayed Reward and Punishment in a Spiking Neural Network Model of Basal Ganglia with Opposing D1/D2 Plasticity. In: Villa, A.E.P., Duch, W., Érdi, P., Masulli, F., Palm, G. (eds) Artificial Neural Networks and Machine Learning – ICANN 2012. ICANN 2012. Lecture Notes in Computer Science, vol 7552. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33269-2_58

DOI: https://doi.org/10.1007/978-3-642-33269-2_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33268-5
Online ISBN: 978-3-642-33269-2
eBook Packages: Computer Science, Computer Science (R0)