Balancing Exploration and Exploitation: A Neurally Inspired Mechanism to Learn Sensorimotor Contingencies

  • Conference paper
Human-Friendly Robotics 2020 (HFR 2020)

Part of the book series: Springer Proceedings in Advanced Robotics ((SPAR,volume 18))

Abstract

The learning of sensorimotor contingencies is essential for the development of early cognition. Here, we investigate how such a process takes place at the neural level. We propose a theoretical concept for learning sensorimotor contingencies based on motor babbling with a robotic arm and dynamic neural fields. The robot learns to perform sequences of motor commands in order to perceive visual activation from a baby mobile toy. First, the robot explores the different sensorimotor outcomes; it then autonomously decides whether or not to use the experience already gathered. Moreover, we introduce a neural mechanism, inspired by recent neuroscience research, that supports the switch between exploration and exploitation. The complete model relies on dynamic field theory, which consists of a set of interconnected dynamical systems. Over time, the robot tends toward exploiting previously learned sensorimotor contingencies and thus selects actions that induce high visual activation.
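
For readers unfamiliar with dynamic neural fields, the following is a minimal sketch of a single one-dimensional field of the Amari type, tau * du/dt = -u + h + s(x) + sum over x' of w(x - x') * f(u(x')), integrated with explicit Euler steps. It is not the model from the paper: the function names, the kernel shape (local Gaussian excitation with constant global inhibition), and every parameter value are assumptions chosen only so that a localized input produces a single self-stabilized peak of activation.

```python
import math


def sigmoid(u, beta=4.0):
    """Output nonlinearity f(u); beta (assumed value) sets its steepness."""
    return 1.0 / (1.0 + math.exp(-beta * u))


def make_kernel(size, c_exc=1.5, sigma_exc=2.0, c_inh=0.3):
    """Interaction kernel w(d): local Gaussian excitation minus global inhibition.

    All strengths and widths are illustrative assumptions.
    """
    half = size // 2
    return [
        c_exc * math.exp(-(d * d) / (2.0 * sigma_exc ** 2)) - c_inh
        for d in range(-half, half + 1)
    ]


def simulate_field(stimulus, steps=200, dt=1.0, tau=10.0, h=-2.0):
    """Euler-integrate tau * du/dt = -u + h + s + (w * f(u)) on a 1D field."""
    n = len(stimulus)
    u = [h] * n                      # start at the (negative) resting level
    kernel = make_kernel(2 * n - 1)  # covers every offset from -(n-1) to n-1
    for _ in range(steps):
        f = [sigmoid(v) for v in u]
        new_u = []
        for i in range(n):
            # Lateral interaction: kernel-weighted sum of all field outputs.
            lateral = sum(kernel[(i - j) + (n - 1)] * f[j] for j in range(n))
            du = (-u[i] + h + stimulus[i] + lateral) / tau
            new_u.append(u[i] + dt * du)
        u = new_u
    return u


if __name__ == "__main__":
    # Hypothetical input: a Gaussian bump centered on site 20 of a 41-site field.
    stim = [4.0 * math.exp(-((i - 20) ** 2) / 8.0) for i in range(41)]
    u = simulate_field(stim)
    peak = max(range(len(u)), key=lambda i: u[i])
    print("peak site:", peak, "activation above threshold:", u[peak] > 0)
```

In this sketch the field stays below threshold everywhere when no input is given, and forms one supra-threshold peak at the stimulated location otherwise. Architectures in dynamic field theory couple several such fields together; how the paper's fields are connected is not specified in the abstract and is not reproduced here.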

Corresponding author

Correspondence to Quentin Houbre.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Houbre, Q., Angleraud, A., Pieters, R. (2021). Balancing Exploration and Exploitation: A Neurally Inspired Mechanism to Learn Sensorimotor Contingencies. In: Saveriano, M., Renaudo, E., Rodríguez-Sánchez, A., Piater, J. (eds) Human-Friendly Robotics 2020. HFR 2020. Springer Proceedings in Advanced Robotics, vol 18. Springer, Cham. https://doi.org/10.1007/978-3-030-71356-0_5