A Possible Explanation for the Generation of Habit in Navigation: a Striatal Behavioral Learning Model

Cognitive Computation

Abstract

Efficient behavioral learning is a major challenge for autonomous mobile robots. During navigation, animals improve their behavioral learning ability by constantly interacting with the environment, and they gradually achieve efficient navigation. Although habitual behavior in animal navigation is relatively well known, understanding of the brain’s habit-generation mechanism remains limited. In this study, we propose a striatal behavioral learning model, composed of a striosome model and a matrix model, as a possible explanation for the generation of habitual behavior in animal navigation. The model’s bionic mechanism is characterized as follows: (1) in the striosome model, orientation information is updated constantly through an operant conditioning mechanism, leading to the generation of habitual behavior, and (2) the matrix model uses an improved ε-greedy algorithm to choose actions by adjusting the utilization of learned habits, balancing exploration and exploitation in an agent’s navigation. We test our model in Morris square dry maze tasks. The results indicate that the model is effective in explaining habit-related behavior. We also compare our model with the widely used striatal temporal difference learning model; the results show that our model is more efficient and robust than this comparison model. We conclude that the model can solve navigation tasks with habits while showing key neural characteristics of the striatum, which may be significant for the bionic navigation of robots. The proposed model confirms and builds a relationship among habit generation, the striatum, and operant conditioning, which may help explain the mechanism underlying habit generation in animal navigation.
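
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model: the abstract does not specify the improved ε-greedy algorithm or the striosome update rule, so the code below uses the standard ε-greedy rule and a simple, hypothetical reward-driven probability update. The function names, the `rate` parameter, and the update formula are all assumptions made for illustration.

```python
import random


def epsilon_greedy(action_values, epsilon, rng=random):
    """Standard epsilon-greedy selection (not the paper's improved variant).

    With probability epsilon pick a random action (exploration);
    otherwise pick the action with the highest learned value (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))
    return max(range(len(action_values)), key=lambda a: action_values[a])


def reinforce(probs, action, reward, rate=0.1):
    """Hypothetical operant-conditioning-style update over orientations.

    The chosen orientation's probability is raised after a positive reward
    and lowered after a negative one, then the distribution is renormalised.
    """
    updated = list(probs)
    # Keep the probability strictly positive so renormalisation stays valid.
    updated[action] = max(1e-6, updated[action] + rate * reward)
    total = sum(updated)
    return [p / total for p in updated]


if __name__ == "__main__":
    probs = [0.25, 0.25, 0.25, 0.25]  # four candidate orientations
    for _ in range(20):
        action = epsilon_greedy(probs, epsilon=0.1)
        reward = 1.0 if action == 2 else -0.5  # toy environment: orientation 2 is rewarded
        probs = reinforce(probs, action, reward)
    print(probs)
```

In this toy setting, repeated reward for one orientation tends to concentrate probability on it, which is one simple way to picture a habit forming; lowering `epsilon` as that concentration grows would be one possible way to shift from exploration toward exploitation.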

Acknowledgements

We thank LetPub (www.letpub.com) for its linguistic assistance during the preparation of this manuscript.

Funding

This study was funded by the National Natural Science Foundation of China (61773027, 62076014).

Author information

Corresponding author

Correspondence to Jing Huang.

Ethics declarations

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chai, J., Ruan, X. & Huang, J. A Possible Explanation for the Generation of Habit in Navigation: a Striatal Behavioral Learning Model. Cogn Comput 14, 1189–1210 (2022). https://doi.org/10.1007/s12559-021-09950-6

  • DOI: https://doi.org/10.1007/s12559-021-09950-6

Keywords

Navigation