A Predictive Reinforcement Model of Dopamine Neurons for Learning Approach Behavior

Journal of Computational Neuroscience

Abstract

A neural network model of how dopamine and prefrontal cortex activity guides short- and long-term information processing within the cortico-striatal circuits during reward-related learning of approach behavior is proposed. The model predicts two types of reward-related neuronal responses generated during learning: (1) cell activity signaling errors in the prediction of the expected time of reward delivery and (2) neural activations coding for errors in the prediction of the amount and type of reward or stimulus expectancies. The former type of signal is consistent with the responses of dopaminergic neurons, while the latter signal is consistent with reward expectancy responses reported in the prefrontal cortex. It is shown that a neural network architecture that satisfies the design principles of the adaptive resonance theory of Carpenter and Grossberg (1987) can account for the dopamine responses to novelty, generalization, and discrimination of appetitive and aversive stimuli. These hypotheses are scrutinized via simulations of the model in relation to the delivery of free food outside a task, the timed contingent delivery of appetitive and aversive stimuli, and an asymmetric, instructed delay response task.
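For illustration only, the first type of signal (an error in the predicted time of reward) can be sketched with the temporal-difference method of Sutton (1988), which appears in the reference list. This is not the adaptive-resonance architecture proposed in the article. The function names, the tapped-delay-line stimulus representation, and all parameter values (`cue_t`, `reward_t`, `alpha`, `gamma`) are assumptions made for this sketch.

```python
import numpy as np

def delay_line(t, cue_t, n_steps):
    """Hypothetical stimulus code: unit i is active i steps after cue onset."""
    x = np.zeros(n_steps)
    if t >= cue_t:
        x[t - cue_t] = 1.0
    return x

def run_trial(w, cue_t, reward_t, n_steps,
              alpha=0.3, gamma=1.0, reward=1.0, learn=True):
    """Run one trial of TD(0); return the prediction error at each time step.

    w is the weight vector of the reward prediction; it is updated in place
    when learn is True.
    """
    delta = np.zeros(n_steps)
    for t in range(n_steps - 1):
        x_t = delay_line(t, cue_t, n_steps)
        x_next = delay_line(t + 1, cue_t, n_steps)
        r = reward if t + 1 == reward_t else 0.0
        # TD error: received reward plus change in predicted future reward.
        delta[t + 1] = r + gamma * (w @ x_next) - (w @ x_t)
        if learn:
            w += alpha * delta[t + 1] * x_t
    return delta

if __name__ == "__main__":
    n_steps, cue_t, reward_t = 10, 2, 7
    w = np.zeros(n_steps)
    before = run_trial(w, cue_t, reward_t, n_steps, learn=False)
    for _ in range(200):
        run_trial(w, cue_t, reward_t, n_steps)
    after = run_trial(w, cue_t, reward_t, n_steps, learn=False)
    omitted = run_trial(w, cue_t, reward_t, n_steps, reward=0.0, learn=False)
    print("error at reward time, before training:", before[reward_t])
    print("error at cue onset, after training:   ", after[cue_t])
    print("error at reward time, after training: ", after[reward_t])
    print("error when reward is omitted:         ", omitted[reward_t])
```

In this sketch the error initially occurs at the time of reward, moves to cue onset as training proceeds, and becomes negative at the expected reward time when the reward is withheld. That is the qualitative pattern the abstract attributes to dopaminergic neurons.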

References

  • Alexander GE, DeLong MR, Strick PL (1986) Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu. Rev. Neurosci. 9:357–381.

  • Apicella P, Ljungberg T, Scarnati E, Schultz W (1991) Responses to reward in monkey dorsal and ventral striatum. Exp. Brain Res. 85:491–500.

  • Apicella P, Scarnati E, Ljungberg T, Schultz W (1992) Neuronal activity in monkey striatum related to the expectation of predictable environmental events. J. Neurophysiol. 68:945–960.

  • Bowman EM, Aigner TG, Richmond BJ (1996) Neural signals in the monkey ventral striatum related to motivation for juice and cocaine rewards. J. Neurophysiol. 75:1061–1073.

  • Brown RG, Marsden CD (1990) Cognitive function in Parkinson's disease: From description to theory. Trends in Neurosci. 13:21–29.

  • Brown VJ, Schwarz U, Bowman EM, Fuhr P, Robinson DL, Hallett M (1993) Dopamine dependent reaction time deficits with Parkinson's disease are task specific. Neuropsychologia 31:459–469.

  • Buonomano DV, Mauk MD (1994) Neural network model of the cerebellum: Temporal discrimination and the timing of motor responses. Neural Computation 6:38–55.

  • Calabresi P, Maj R, Pisani A, Mercuri NB, Bernardi G (1992) Long-term synaptic depression in the striatum: Physiological and pharmacological characterization. J. Neurosci. 12:4224–4233.

  • Canavan AGM, Passingham RE, Marsden CD, Quinn N, Wyke M, Polkey CE (1989) The performance on learning tasks of patients in the early stages of Parkinson's disease. Neuropsychologia 27:141–156.

  • Carpenter GA (1997) Distributed learning, recognition, and prediction by ART and ARTMAP neural networks. Neural Networks 10:1473–1494.

  • Carpenter GA, Grossberg S (1987) ART 2: Self-organization of stable category recognition codes for analog input patterns. Applied Optics 26:4919–4930.

  • Carpenter GA, Grossberg S (1990) ART-3: Hierarchical search using chemical transmitters in self-organizing pattern recognition architectures. Neural Networks 3:129–152.

  • Contreras-Vidal JL, Schultz W (1996) A neural network model of reward-related learning, motivation and orienting behavior (Abstract). Soc. Neurosci. Abs. 22:2029.

  • Contreras-Vidal JL, Stelmach GE (1995) A neural model of basal ganglia-thalamocortical relations in normal and parkinsonian movement. Biol. Cybern. 73:467–476.

  • Eblen F, Graybiel AM (1995) Highly restricted origin of prefrontal cortical inputs to striosomes in the macaque monkey. J. Neurosci. 15:5999–6013.

  • Fiala JC, Grossberg S, Bullock D (1996) Metabotropic glutamate receptor activation in cerebellar Purkinje cells as substrate for adaptive timing of the classically conditioned eye-blink response. J. Neurosci. 16:3760–3774.

  • Gaffan D, Murray EA, Fabre-Thorpe M (1993) Interaction of the amygdala with the frontal lobe in reward memory. Eur. J. Neurosci. 5:968–975.

  • Gariano RF, Groves PM (1988) Burst firing induced in midbrain dopamine neurons by stimulation of the medial prefrontal and anterior cingulate cortices. Brain Res. 462:194–198.

  • Gaspar P, Stepniewska I, Kaas J (1992) Topography and collateralization of the dopaminergic projections to motor and lateral prefrontal cortex in Owl monkeys. J. Comp. Neurol. 325:1–21.

  • Gerfen CR (1989) The neostriatal mosaic: Striatal patch-matrix organization is related to cortical lamination. Science 246:385–388.

  • Gerfen CR (1992) The neostriatal mosaic: Multiple levels of compartmental organization in the basal ganglia. Annu. Rev. Neurosci. 15:285–320.

  • Gerfen CR, Herkenham M, Thibault J (1987) The neostriatal mosaic. II. Patch- and matrix-directed mesostriatal dopaminergic and nondopaminergic systems. J. Neurosci. 7:3935–3944.

  • Goldman-Rakic PS, Porrino LJ (1985) The primate mediodorsal (MD) nucleus and its projection to the frontal lobe. J. Comp. Neurol. 242:535–560.

  • Gotham AM, Brown RG, Marsden CD (1988) “Frontal” cognitive function in patients with Parkinson's disease “on” and “off” levodopa. Brain 111:299–321.

  • Grace AA, Bunney BS (1985) Opposing effects of striatonigral feedback pathways on midbrain dopamine cell activity. Brain Res. 333:271–284.

  • Graveland GA, DiFiglia M (1985) The frequency and distribution of medium-sized neurons with indented nuclei in the primate and rodent neostriatum. Brain Res. 327:307–311.

  • Graybiel AM (1990) Neurotransmitters and neuromodulators in the basal ganglia. Trends in Neurosci. 13:244–254.

  • Groenewegen HJ (1988) Organization of the afferent connections of the mediodorsal thalamic nucleus in the rat, related to the mediodorsal-prefrontal topography. Neuroscience 24:379–431.

  • Grossberg S, Merrill JWL (1992) A neural network model of adaptively timed reinforcement learning and hippocampal dynamics. Cogn. Brain Res. 1:3–38.

  • Haber SN, Lynd E, Klein C, Groenewegen HJ (1990) Topographic organization of the ventral striatal efferent projections in the rhesus monkey: An anterograde tracing study. J. Comp. Neurol. 293:282–298.

  • Hodgkin AL, Huxley AF (1952) A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117:500–544.

  • Hollerman J, Schultz W (1996) Activity of dopamine neurons during learning in a familiar task context (Abstract). Soc. Neurosci. Abs. 22:1388.

  • Hoover JE, Strick PL (1993) Multiple output channels in the basal ganglia. Science 259:819–821.

  • Houk JC, Adams JL, Barto AG (1995) A model of how basal ganglia generate and use neural signals that predict reinforcement. In: JC Houk, JL Davis, DG Beiser, eds. Models of Information Processing in the Basal Ganglia. MIT Press, Cambridge, MA. pp. 249–270.

  • Ivry RB (1996) The representation of temporal information in perception and motor control. Current Opinion in Neurobiology 6:851–857.

  • Jaeger D, Kita H, Wilson CJ (1994) Surround inhibition among projection neurons is weak or nonexistent in the rat neostriatum. J. Neurophysiol. 72:2555–2558.

  • Jimenez-Castellanos J, Graybiel AM (1987) Subdivisions of the dopamine-containing A8-A9-A10 complex identified by their differential mesostriatal innervation of striosomes and extrastriosomal matrix. Neurosci. 23:223–242.

  • Jimenez-Castellanos J, Graybiel AM (1989) Evidence that histochemically distinct zones of the primate substantia nigra pars compacta are related to patterned distributions of nigrostriatal projection neurons and striatonigral fibers. Exp. Brain Res. 74:227–238.

  • Jueptner M, Rijntjes M, Weiller C, Faiss JH, Timmann D, Mueller SP, Diener HC (1995) Localization of a cerebellar timing process using PET. Neurology 45:1540–1545.

  • Knowlton BJ, Mangels JA, Squire LR (1996) A neostriatal habit learning system in humans. Science 273:1399–1402.

  • Kornhuber J, Kim J-S, Kornhuber ME, Kornhuber HH (1984) The cortico-nigral projection: Reduced glutamate content in the substantia nigra following frontal cortex ablation in the rat. Brain Res. 322:124–126.

  • Kötter R, Wickens J (1995) Interactions of glutamate and dopamine in a computational model of the striatum. J. Computational Neuroscience 2:195–214.

  • Künzle H (1978) An autoradiographic analysis of the efferent connections from premotor and adjacent prefrontal regions (Areas 6 and 9) in Macaca fascicularis. Brain Behav. Evol. 15:185–234.

  • Levine DS, Prueitt PS (1989) Modeling some effects of frontal lobe damage: Novelty and perseveration. Neural Networks 2:103–116.

  • Linden A, Bracke-Tolkmitt R, Lutzenberger W, Canavan AGM, Scholz E, Diener H-C, Birbaumer N (1990) Slow cortical potentials in parkinsonian patients during the course of an associative learning task. J. Psychophysiol. 4:145–162.

  • Ljungberg T, Apicella P, Schultz W (1992) Responses of monkey dopamine neurons during learning of behavioral reactions. J. Neurophysiol. 67:145–163.

  • Maricq AV, Church RM (1983) The differential effects of haloperidol and methamphetamine on time estimation in the rat. Psychopharmacology 79:10–15.

  • Meck WH (1996) Neuropharmacology of timing and time perception. Cogn. Brain Res. 3:227–242.

  • Milner B (1963) Effects of different brain lesions on card sorting. Arch. Neurol. 9:90–100.

  • Milner B (1964) Some effects of frontal lobectomy in man. In: J Warren, K Akert, eds. The Frontal Granular Cortex and Behavior. McGraw-Hill, New York. pp. 313–334.

  • Mirenowicz J, Schultz W (1994) Importance of unpredictability for reward responses in primate dopamine neurons. J. Neurophysiol. 72:1024–1027.

  • Mirenowicz J, Schultz W (1996) Preferential activation of midbrain dopamine neurons by appetitive rather than aversive stimuli. Nature 379:449–451.

  • Montague PR, Dayan P, Sejnowski TJ (1996) A framework for mesencephalic dopamine systems based on predictive hebbian learning. J. Neurosci. 16:1936–1947.

  • Murase S, Grenhoff J, Chouvet G, Gonon FG, Svensson TH (1993) Prefrontal cortex regulates burst firing and transmitter release in rat mesolimbic dopamine neurons studied in vivo. Neurosci. Lett. 157:53–56.

  • Nishino H, Ono T, Fukuda M, Sasaki K, Muramoto KI (1981) Single unit activity in monkey caudate nucleus during operant bar pressing feeding behavior. Neurosci. Lett. 21:105–110.

  • Parent A, Hazrati L-N (1995) Functional anatomy of the basal ganglia. I. The cortico-basal ganglia-thalamo-cortical loop. Brain Res. Rev. 20:91–127.

  • Pastor MA, Artieda J, Jahanshahi M, Obeso JA (1992) Time estimation and reproduction is abnormal in Parkinson's disease. Brain 115:211–225.

  • Perret SP, Ruiz BP, Mauk MD (1993) Cerebellar cortex lesions disrupt learning-dependent timing of conditioned eyelid responses. J. Neurosci. 13:1708–1718.

  • Plenz D, Aertsen A (1996) Neural dynamics in cortex-striatum cocultures. II. Spatiotemporal characteristics of neuronal activity. Neuroscience 70:893–924.

  • Porrino LJ, Goldman-Rakic PS (1982) Brainstem innervation of prefrontal and anterior cingulate cortex in the Rhesus monkey revealed by retrograde transport of HRP. J. Comp. Neurol. 205:63–76.

  • Pucak ML, Grace AA (1994) Regulation of substantia nigra dopamine neurons. Critical Rev. Neurobiol. 9:67–89.

  • Raijmakers MEJ, van der Maas HLJ, Molenaar PCM (1996) Numerical bifurcation analysis of distance-dependent on-center off-surround shunting neural networks. Biol. Cybern. 75:495–507.

  • Rebec GV, Curtis SD (1988) Reciprocal zones of excitation and inhibition in the neostriatum. Synapse 2:633–635.

  • Romo R, Schultz W (1990) Dopamine neurons of the monkey midbrain: Contingencies of responses to active touch during self-initiated arm movements. J. Neurophysiol. 63:592–606.

  • Russchen FT, Bakst I, Amaral DG, Price JL (1985) The amygdalostriatal projections in the monkey: An anterograde tracing study. Brain Res. 329:241–257.

  • Sahakian B, Morris R, Evenden J, Heald A, Levy R, Philpot M, Robbins T (1988) A comparative study of visuo-spatial memory and learning in Alzheimer-type dementia and Parkinson's disease. Brain 111:695–718.

  • Schultz W (1986) Activity of pars reticulata neurons of monkey substantia nigra in relation to motor, sensory, and complex events. J. Neurophysiol. 55:660–677.

  • Schultz W, Apicella P, Ljungberg T (1993) Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. J. Neurosci. 13:900–913.

  • Schultz W, Apicella P, Scarnati E, Ljungberg T (1992) Neuronal activity in monkey ventral striatum related to the expectation of reward. J. Neurosci. 12:4595–4610.

  • Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275:1593–1598.

  • Schultz W, Romo R (1990) Dopamine neurons of the monkey midbrain: Contingencies of responses to stimuli eliciting immediate behavioral reactions. J. Neurophysiol. 63:607–624.

  • Sherman SM, Guillery RW (1996) Functional organization of thalamocortical relays. J. Neurophysiol. 76:1367–1395.

  • Smith Y, Charara A, Parent A (1996) Synaptic innervation of midbrain dopaminergic neurons by glutamate-enriched terminals in the squirrel monkey. J. Comp. Neurol. 364:231–253.

  • Sprengelmeyer R, Canavan AGM, Lange HW, Homberg V (1995) Associative learning in degenerative neostriatal disorders: Contrasts in explicit and implicit remembering between Parkinson's and Huntington's diseases. Mov. Disorders 10:51–65.

  • Strick PL, Dum RP, Mushiake H (1995) Basal ganglia “loops” with the cerebral cortex. In: M Kimura, AM Graybiel, eds. Functions of the Cortico-Basal Ganglia Loop. Springer-Verlag, Tokyo. pp. 106–124.

  • Suri RE, Schultz W (1996) A neural learning model based on the activity of primate dopamine neurons (Abstract). Soc. Neurosci. Abs. 22:1389.

  • Sutton RS (1988) Learning to predict by the methods of temporal differences. Machine Learning 3:9–44.

  • Sutton RS, Barto AG (1981) Toward a modern theory of adaptive networks: Expectation and prediction. Psychol. Rev. 88:135–171.

  • Tong ZY, Overton PG, Clark D (1996) Stimulation of the prefrontal cortex in the rat induces patterns of activity in midbrain dopaminergic neurons which resemble natural burst events. Synapse 22:195–208.

  • Watanabe M (1990) Prefrontal unit activity during associative learning in the monkey. Exp. Brain Res. 80:296–309.

  • Watanabe M (1996) Reward expectancy in primate prefrontal neurons. Nature 382:629–632.

  • Wichmann T, Vitek JL, DeLong MR (1995) Parkinson's disease and the basal ganglia: Lessons from the laboratory and from neurosurgery. Neuroscientist 1:236–244.

  • Wickens JR, Alexander ME, Miller R (1991) Two dynamic modes of striatal function under dopaminergic-cholinergic control: Simulation and analysis of a model. Synapse 8:1–12.

  • Wickens JR, Begg AJ, Arbuthnott GW (1996) Dopamine reverses the depression of rat corticostriatal synapses which normally follows high-frequency stimulation of cortex in vitro. Neuroscience 70:1–5.

  • Young WS III, Alheid GF, Heimer L (1984) The ventral pallidal projection to the mediodorsal thalamus: A study with fluorescent retrograde tracers and immunohisto-fluorescence. J. Neurosci. 4:1626–1638.

Cite this article

Contreras-Vidal, J.L., Schultz, W. A Predictive Reinforcement Model of Dopamine Neurons for Learning Approach Behavior. J Comput Neurosci 6, 191–214 (1999). https://doi.org/10.1023/A:1008862904946

  • DOI: https://doi.org/10.1023/A:1008862904946