Abstract
Companion-Systems interact with users via flexible, goal-directed dialogs. During a dialog, both the user and the Companion-System can identify and communicate their goals iteratively. In that sense, both can be conceptualized as communication partners, each equipped with a processing scheme that produces actions as outputs in response to (1) inputs from the other communication partner and (2) internally represented goals. A quite general core competence of communication partners is the capability for strategy change, defined as the modification of action planning under the boundary condition of maintaining a constant goal. Interestingly, the biological fundamentals of this capability are largely unknown. Here we describe a research program that employs an animal model of strategy change to (1) investigate its underlying neuronal mechanisms and (2) describe these mechanisms in an algorithmic syntax suitable for implementation in technical Companion-Systems. It is crucial for this research program that the investigated scenarios be sufficiently complex to contain all relevant aspects of strategy change, but at the same time simple enough to allow for a detailed neurophysiological analysis that is obtainable only in animal models. To this end, two forms of strategy change are considered in detail: strategy change caused by modified feature selection, and strategy change caused by modified action assignment.
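The two forms of strategy change named above can be illustrated with a minimal sketch. It is not the authors' model: the `Agent` class, its fields, and the go/nogo stimulus values are hypothetical names chosen only for illustration. The sketch assumes that a strategy consists of an attended stimulus feature plus a mapping from that feature's values to actions, and that the goal stays fixed while either part is exchanged.

```python
from dataclasses import dataclass, replace
from typing import Dict

# Hypothetical stimulus representation: feature name -> feature value.
Stimulus = Dict[str, str]


@dataclass(frozen=True)
class Agent:
    goal: str                   # held constant across strategy changes
    feature: str                # stimulus feature the agent attends to
    assignment: Dict[str, str]  # attended feature value -> action

    def act(self, stimulus: Stimulus) -> str:
        """Select an action from the attended feature of the input."""
        return self.assignment[stimulus[self.feature]]


def change_action_assignment(agent: Agent, assignment: Dict[str, str]) -> Agent:
    """Same goal, same attended feature, new value-to-action mapping."""
    return replace(agent, assignment=assignment)


def change_feature_selection(agent: Agent, feature: str,
                             assignment: Dict[str, str]) -> Agent:
    """Same goal, but a different feature now drives action selection."""
    return replace(agent, feature=feature, assignment=assignment)


# Illustrative values only (assumed, not taken from the chapter).
agent = Agent(goal="obtain reward", feature="pitch",
              assignment={"high": "go", "low": "nogo"})
stimulus = {"pitch": "high", "duration": "short"}

reversed_agent = change_action_assignment(
    agent, {"high": "nogo", "low": "go"})
shifted_agent = change_feature_selection(
    agent, "duration", {"long": "go", "short": "nogo"})

print(agent.act(stimulus), reversed_agent.act(stimulus),
      shifted_agent.act(stimulus))
```

In both cases the `goal` field is untouched; only the route from input to action changes, which is the boundary condition stated in the definition of strategy change.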
Acknowledgements
This work was done within the Transregional Collaborative Research Centre SFB/TRR 62 “Companion-Technology for Cognitive Technical Systems” funded by the German Research Foundation (DFG).
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Schulz, A.L., Woldeit, M.L., Ohl, F.W. (2017). Neurobiological Fundamentals of Strategy Change: A Core Competence of Companion-Systems. In: Biundo, S., Wendemuth, A. (eds) Companion Technology. Cognitive Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-43665-4_8
DOI: https://doi.org/10.1007/978-3-319-43665-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43664-7
Online ISBN: 978-3-319-43665-4
eBook Packages: Computer Science, Computer Science (R0)