Reinforcement Learning and Attractor Neural Network Models of Associative Learning

  • Conference paper
  • In: Computational Intelligence (IJCCI 2017)

Part of the book series: Studies in Computational Intelligence (SCI, volume 829)

Abstract

Despite indisputable advances in reinforcement learning (RL) research, some cognitive and architectural challenges remain. The primary challenge in the current conception of RL stems from how the theory defines states. Whereas states under laboratory conditions are tractable (due to the Markov property), states in real-world RL are high-dimensional, continuous, and partially observable. Effective learning and generalization can therefore be guaranteed only if the subset of reward-relevant dimensions is correctly identified for each state. Moreover, the computational discrepancy between model-free and model-based RL methods creates a stability-plasticity dilemma: how should optimal decision-making be controlled when multiple interacting and competing systems each implement a different type of RL method? By presenting behavioral results showing that human subjects in a reversal learning paradigm flexibly define states in ways a simple RL model cannot capture, we argue that these challenges can be met by infusing the RL framework, as an algorithmic theory of human behavior, with the strengths of the attractor framework at the level of neural implementation. Our position is supported by the hypothesis that 'attractor states', stable patterns of self-sustained and reverberating brain activity, are a manifestation of the collective dynamics of neuronal populations in the brain. With its capacity for pattern completion and its ability to link events in temporal order, an attractor network is relatively insensitive to noise, allowing it to cope with the sparse data characteristic of high-dimensional, continuous real-world RL.
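The pattern-completion property invoked above can be made concrete with a minimal sketch. The following Python snippet is not the authors' model; it is an illustrative Hopfield-style attractor network in the sense of Hopfield (1982), with Hebbian one-shot weights, and all names and parameters here are assumptions chosen for the example. It stores a few random binary patterns and recovers one of them from a noisy cue:

```python
# Minimal sketch (illustrative, not the authors' implementation): a Hopfield-style
# attractor network recovering a stored pattern from a corrupted cue.
import numpy as np

rng = np.random.default_rng(0)

n_units, n_patterns = 100, 5
# Random +/-1 memories to store (well below the ~0.14*N capacity limit).
patterns = rng.choice([-1, 1], size=(n_patterns, n_units))

# Hebbian one-shot storage: W_ij = (1/N) * sum_mu xi_i^mu * xi_j^mu, no self-connections.
W = (patterns.T @ patterns) / n_units
np.fill_diagonal(W, 0.0)

def recall(cue, steps=20):
    """Asynchronous dynamics: each unit aligns with its local field until stable."""
    s = cue.copy()
    for _ in range(steps):
        for i in rng.permutation(n_units):
            s[i] = 1 if W[i] @ s >= 0 else -1
    return s

# Corrupt 25% of one stored pattern, then let the dynamics complete it.
cue = patterns[0].copy()
flip = rng.choice(n_units, size=n_units // 4, replace=False)
cue[flip] *= -1

recovered = recall(cue)
print("overlap with stored pattern:", (recovered @ patterns[0]) / n_units)  # typically ~1.0
```

Because the corrupted cue still falls within the basin of attraction of the stored memory, the asynchronous dynamics settle back onto the original pattern. This robustness to noisy, partial input is the property the abstract appeals to when arguing that attractor networks can cope with sparse data in high-dimensional, continuous real-world RL.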



Author information

Correspondence to Oussama H. Hamid.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Hamid, O.H., Braun, J. (2019). Reinforcement Learning and Attractor Neural Network Models of Associative Learning. In: Sabourin, C., Merelo, J.J., Madani, K., Warwick, K. (eds) Computational Intelligence. IJCCI 2017. Studies in Computational Intelligence, vol 829. Springer, Cham. https://doi.org/10.1007/978-3-030-16469-0_17
