Abstract
Extending previous work, we introduce a spiking actor-critic network model of learning from reward and punishment in the basal ganglia. In the model, the striatum is segregated into populations of medium spiny neurons (MSNs) that express either the D1 or the D2 dopamine receptor type. This segregation allows positive and negative expected outcomes to be represented explicitly, each within its respective population. In line with recent experiments, we further assume that the D1 and D2 MSN populations have opposing dopamine-modulated bidirectional synaptic plasticity. Experiments were conducted in a grid world, where a moving agent had to reach a remote rewarded goal state. In contrast to the previous model, the network learned not only to approach the rewarded goal but also to consistently avoid punishments. The spiking network model explains the functional role of D1/D2 MSN segregation within the striatum, specifically the reversed direction of dopamine-dependent plasticity found at synapses converging on different MSNs.
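The learning scheme summarized above can be illustrated with a deliberately simplified, non-spiking sketch. It is not the authors' spiking network: it is a tabular actor-critic in which a single temporal-difference error stands in for the dopamine signal and changes two non-negative value tables in opposite directions, one standing in for the D1 population (positive expected outcome) and one for the D2 population (negative expected outcome). The function names `td_step` and `run_grid_world`, the 5x5 layout, the position of the punished cell, and all learning rates are assumptions chosen for illustration only.

```python
import numpy as np


def td_step(d1, d2, prefs, s, a, r, s_next, terminal,
            gamma=0.9, alpha=0.1, beta=0.1):
    """One actor-critic update with opposing D1/D2 changes (illustrative only)."""
    # Net state value: D1 table (positive outcome) minus D2 table (negative outcome).
    v_s = d1[s] - d2[s]
    v_next = 0.0 if terminal else d1[s_next] - d2[s_next]
    # TD error, used here as a stand-in for the phasic dopamine signal.
    delta = r + gamma * v_next - v_s
    # Opposing plasticity: the same signal moves D1 and D2 in opposite directions.
    # Both tables are clipped at zero so each keeps a single sign of outcome.
    d1[s] = max(0.0, d1[s] + alpha * delta)
    d2[s] = max(0.0, d2[s] - alpha * delta)
    # Actor: raise or lower the preference for the action that was taken.
    prefs[s, a] += beta * delta
    return delta


def run_grid_world(episodes=200, size=5, seed=0, max_steps=200):
    """Train on a small grid with one rewarded goal and one punished cell."""
    rng = np.random.default_rng(seed)
    n_states = size * size
    goal = n_states - 1                        # hypothetical goal: bottom-right corner
    pit = (size // 2) * size + size // 2       # hypothetical punished cell: grid centre
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    d1 = np.zeros(n_states)
    d2 = np.zeros(n_states)
    prefs = np.zeros((n_states, len(moves)))
    for _ in range(episodes):
        s = 0                                  # start in the top-left corner
        for _ in range(max_steps):
            # Softmax action selection over the actor's preferences.
            p = np.exp(prefs[s] - prefs[s].max())
            p /= p.sum()
            a = int(rng.choice(len(moves), p=p))
            # Move, staying in place when the step would leave the grid.
            row, col = divmod(s, size)
            row = min(max(row + moves[a][0], 0), size - 1)
            col = min(max(col + moves[a][1], 0), size - 1)
            s_next = row * size + col
            r = 1.0 if s_next == goal else (-1.0 if s_next == pit else 0.0)
            terminal = s_next == goal
            td_step(d1, d2, prefs, s, a, r, s_next, terminal)
            s = s_next
            if terminal:
                break
    return d1, d2, prefs
```

Calling `d1, d2, prefs = run_grid_world()` returns the two value tables and the action preferences. Keeping the tables separate and non-negative is the design choice that mirrors the abstract's point: reward-related and punishment-related expectations are stored in distinct populations and updated with reversed sign.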
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jitsev, J., Abraham, N., Morrison, A., Tittgemeyer, M. (2012). Learning from Delayed Reward und Punishment in a Spiking Neural Network Model of Basal Ganglia with Opposing D1/D2 Plasticity. In: Villa, A.E.P., Duch, W., Érdi, P., Masulli, F., Palm, G. (eds) Artificial Neural Networks and Machine Learning – ICANN 2012. ICANN 2012. Lecture Notes in Computer Science, vol 7552. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33269-2_58
DOI: https://doi.org/10.1007/978-3-642-33269-2_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33268-5
Online ISBN: 978-3-642-33269-2
eBook Packages: Computer Science, Computer Science (R0)