Balancing Exploration and Exploitation: A Neurally Inspired Mechanism to Learn Sensorimotor Contingencies

Houbre, Quentin; Angleraud, Alexandre; Pieters, Roel

doi:10.1007/978-3-030-71356-0_5

Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 18)

Included in the following conference series:

International Workshop on Human-Friendly Robotics

Abstract

The learning of sensorimotor contingencies is essential for the development of early cognition. Here, we investigate how such a process takes place at the neural level. We propose a theoretical concept for learning sensorimotor contingencies based on motor babbling with a robotic arm and dynamic neural fields. The robot learns to perform sequences of motor commands in order to perceive visual activation from a baby mobile toy. First, the robot explores the different sensorimotor outcomes; it then autonomously decides whether or not to use the experience it has already gathered. Moreover, we introduce a neural mechanism, inspired by recent neuroscience research, that supports the switch between exploration and exploitation. The complete model relies on dynamic field theory, which consists of a set of interconnected dynamical systems. Over time, the robot's behavior shifts toward exploiting previously learned sensorimotor contingencies, and it thus selects actions that induce high visual activation.
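
As a rough illustration of the building block the abstract refers to, the following is a minimal sketch of a single one-dimensional dynamic neural field of the Amari type, integrated with Euler steps. It is not the authors' model: the function name `simulate_field`, the field size, the kernel shape, and every numerical parameter are assumptions chosen only to show how a sufficiently strong localized input lifts the field above threshold and forms a self-stabilized peak, while the field stays at rest without input.

```python
import math


def simulate_field(input_amplitude, n=41, steps=300, dt=1.0, tau=10.0, h=-5.0):
    """Euler-integrate tau * du/dt = -u + h + s(x) + sum_j w(x - x_j) f(u_j).

    All parameter values are illustrative assumptions, not taken from the chapter.
    """
    center = n // 2

    # Localized Gaussian stimulus s(x), centered on the middle of the field.
    stimulus = [
        input_amplitude * math.exp(-((i - center) ** 2) / (2 * 2.0 ** 2))
        for i in range(n)
    ]

    # Interaction kernel: local Gaussian excitation minus constant global inhibition.
    def kernel(distance):
        return 4.0 * math.exp(-(distance ** 2) / (2 * 2.0 ** 2)) - 1.0

    weights = [[kernel(i - j) for j in range(n)] for i in range(n)]

    # The field starts at its negative resting level h.
    u = [h] * n
    for _ in range(steps):
        # Sigmoid output: only units with u above 0 contribute noticeably.
        output = [1.0 / (1.0 + math.exp(-4.0 * value)) for value in u]
        interaction = [
            sum(weights[i][j] * output[j] for j in range(n)) for i in range(n)
        ]
        u = [
            u[i] + (dt / tau) * (-u[i] + h + stimulus[i] + interaction[i])
            for i in range(n)
        ]
    return u


if __name__ == "__main__":
    with_input = simulate_field(8.0)
    without_input = simulate_field(0.0)
    peak_index = max(range(len(with_input)), key=lambda i: with_input[i])
    print("peak location with input:", peak_index)
    print("field stays below threshold without input:",
          all(value < 0 for value in without_input))
```

In dynamic field theory, architectures are built by coupling several such fields; the sketch above covers only one field in isolation and says nothing about how the chapter connects them or implements the exploration and exploitation switch.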

Author information

Authors and Affiliations

Tampere University, Tampere, Finland
Quentin Houbre, Alexandre Angleraud & Roel Pieters

Corresponding author

Correspondence to Quentin Houbre.

Editor information

Editors and Affiliations

Department of Computer Science and Digital Science Center, University of Innsbruck, Innsbruck, Austria
Matteo Saveriano
Department of Computer Science, University of Innsbruck, Innsbruck, Tirol, Austria
Erwan Renaudo
Department of Computer Science, University of Innsbruck, Innsbruck, Austria
Antonio Rodríguez-Sánchez
Department of Computer Science and Digital Science Center, University of Innsbruck, Innsbruck, Austria
Justus Piater

About this paper

Cite this paper

Houbre, Q., Angleraud, A., Pieters, R. (2021). Balancing Exploration and Exploitation: A Neurally Inspired Mechanism to Learn Sensorimotor Contingencies. In: Saveriano, M., Renaudo, E., Rodríguez-Sánchez, A., Piater, J. (eds) Human-Friendly Robotics 2020. HFR 2020. Springer Proceedings in Advanced Robotics, vol 18. Springer, Cham. https://doi.org/10.1007/978-3-030-71356-0_5

DOI: https://doi.org/10.1007/978-3-030-71356-0_5
Published: 07 March 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71355-3
Online ISBN: 978-3-030-71356-0
eBook Packages: Intelligent Technologies and Robotics (R0)
