
A hierarchical representation of behaviour supporting open ended development and progressive learning for artificial agents

  • Original Research
  • Published in: Autonomous Robots

Abstract

One of the challenging aspects of open-ended or lifelong agent development is that the behaviour an agent is trained for at a given moment can later become an element in the creation of one, or even several, behaviours of greater complexity, whose purpose cannot be anticipated. In this paper, we present modular influence network design (MIND), an artificial agent control architecture suited to open-ended and cumulative learning. The MIND architecture encapsulates sub-behaviours into modules and combines them into a hierarchy reflecting the modular and hierarchical nature of complex tasks. Compared to similar research, the main original aspect of MIND is its multi-layered hierarchy, which uses a generic control signal, the influence, to obtain an efficient global behaviour. This article shows the ability of MIND to learn a curriculum of independent didactic tasks of increasing complexity covering different aspects of a desired behaviour. In so doing, we demonstrate the contributions of MIND to open-ended development: encapsulation into modules allows all the skills acquired during the curriculum to be preserved, reused and retrained in a focused way, while the modular structure supports an evolving topology by easing the coordination of new sensors, actuators and heterogeneous learning structures.
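To make the abstract's core idea concrete, here is a minimal sketch of how modules might combine sub-behaviour outputs through an influence signal. The paper's actual definitions are not reproduced in this excerpt, so every name and the blending rule below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a MIND-style modular hierarchy. Each module maps an
# observation to an action proposal plus an influence in [0, 1]; a parent node
# blends its children's proposals, weighting each by its influence. All names
# and the weighted-average rule are assumptions for illustration only.

from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

@dataclass
class Module:
    name: str
    # Maps an observation to (action proposal, influence in [0, 1]).
    policy: Callable[[float], Tuple[float, float]]

def combine(modules: Sequence[Module], observation: float) -> float:
    """Influence-weighted average of the sub-behaviours' proposals."""
    proposals = [m.policy(observation) for m in modules]
    total = sum(infl for _, infl in proposals)
    if total == 0.0:
        return 0.0  # no module claims relevance to this observation
    return sum(act * infl for act, infl in proposals) / total

# Two toy sub-behaviours: one steers toward a target, one away from an obstacle.
seek = Module("seek", lambda obs: (+1.0, 0.8))
avoid = Module("avoid", lambda obs: (-1.0, 0.2))

print(combine([seek, avoid], observation=0.0))  # ≈ 0.6
```

Because each module only exposes a proposal and an influence, a trained module can be reused unchanged as a child of a new, more complex behaviour, which is the reusability property the abstract emphasises.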


Notes

  1. Objects can be naturally split into parts and sub-parts, complex features and simple features (Kruger et al. 2013).

  2. JBox2d: www.jbox2d.org.

  3. Videos of the results are available at the following addresses:

  4. Raspberry Pi 3: www.raspberrypi.org/products/raspberry-pi-3-model-b-plus/.

  5. Grove Pi: www.dexterindustries.com/grovepi/.

  6. OpenCV computer vision library: https://opencv.org/.

  7. Videos of the results are available at the following address: www.lirmm.fr/~suro/videos/clawDemo.mp4; https://hal.archives-ouvertes.fr/hal-02594407.

References

  • Arkin, R. C., & Balch, T. (1997). Aura: Principles and practice in review. Journal of Experimental & Theoretical Artificial Intelligence, 9(2–3), 175–189.


  • Barto, A. G., Singh, S., & Chentanez, N. (2004). Intrinsically motivated learning of hierarchical collections of skills. In Proceedings of the 3rd International Conference on Development and Learning (pp. 112–19).

  • Bengio, Y., Louradour, J., Collobert, R., & Weston, J. (2009). Curriculum learning. In Proceedings of the 26th Annual International Conference on Machine Learning (pp. 41–48). ACM.

  • Blaes, S., Pogančić, M. V., Zhu, J., & Martius, G. (2019). Control what you can: Intrinsically motivated task-planning agent. In Advances in Neural Information Processing Systems (pp. 12520–12531).

  • Braitenberg, V. (1986). Vehicles: Experiments in synthetic psychology. Cambridge: MIT Press.


  • Brooks, R. A. (1986). A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, 2(1), 14–23.


  • Connor, J. T., Martin, R. D., & Atlas, L. E. (1994). Recurrent neural networks and robust time series prediction. IEEE Transactions on Neural Networks, 5(2), 240–254.


  • De Jong, K. A. (1992). Are genetic algorithms function optimizers? PPSN, 2, 3–14.


  • Devin, C., Gupta, A., Darrell, T., Abbeel, P., & Levine, S. (2017). Learning modular neural network policies for multi-task and multi-robot transfer. In 2017 IEEE international conference on robotics and automation (ICRA) (pp. 2169–2176). IEEE.

  • Dorigo, M., & Colombetti, M. (1994). Robot shaping: Developing autonomous agents through learning. Artificial Intelligence, 71(2), 321–370.


  • Dorigo, M., & Colombetti, M. (1998). Robot shaping: An experiment in behavior engineering. Cambridge: MIT Press.


  • Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1(1), 1–47.


  • Foglino, F., Christakou, C. C., & Leonetti, M. (2019). An optimization framework for task sequencing in curriculum learning. In Joint IEEE 9th International Conference ICDL-EpiRob (pp. 207–214). IEEE.

  • Forestier, S., Mollard, Y., & Oudeyer, P.-Y. (2017). Intrinsically motivated goal exploration processes with automatic curriculum learning. arXiv preprint arXiv:1708.02190.

  • Gen, M., & Lin, L. (2007). Genetic algorithms. In Wiley Encyclopedia of Computer Science and Engineering (pp. 1–15).

  • Gülçehre, Ç., Moczulski, M., Visin, F., & Bengio, Y. (2016). Mollifying networks. CoRR, abs/1608.04980.

  • Heess, N., Wayne, G., Tassa, Y., Lillicrap, T. P., Riedmiller, M. A., & Silver, D. (2016). Learning and transfer of modulated locomotor controllers. CoRR, abs/1610.05182.

  • Hester, T., & Stone, P. (2017). Intrinsically motivated model learning for developing curious robots. Artificial Intelligence, 247, 170–186.


  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097–1105).

  • Kruger, N., Janssen, P., Kalkan, S., Lappe, M., Leonardis, A., Piater, J., et al. (2013). Deep hierarchies in the primate visual cortex: What can we learn for computer vision? IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1847–1871.


  • Larsen, T., & Hansen, S. T. (2005). Evolving composite robot behaviour: A modular architecture. In Proceedings of the Fifth International Workshop on Robot Motion and Control (RoMoCo'05) (pp. 271–276). IEEE.

  • Lessin, D., Fussell, D., & Miikkulainen, R. (2013). Open-ended behavioral complexity for evolved virtual creatures. In Proceedings of the 15th annual conference on Genetic and evolutionary computation (pp. 335–342).

  • Lessin, D., Fussell, D., Miikkulainen, R., & Risi, S. (2015). Increasing behavioral complexity for evolved virtual creatures with the esp method. arXiv preprint arXiv:1510.07957.

  • Lopes, M., & Oudeyer, P.-Y. (2012). The strategic student approach for life-long exploration and learning. In 2012 IEEE international conference on development and learning and epigenetic robotics (ICDL) (pp. 1–8). IEEE.

  • Lukoševičius, M., & Jaeger, H. (2009). Reservoir computing approaches to recurrent neural network training. Computer Science Review, 3(3), 127–149.


  • Lungarella, M., Metta, G., Pfeifer, R., & Sandini, G. (2003). Developmental robotics: A survey. Connection Science, 15(4), 151–190.


  • Narvekar, S., Sinapov, J., Leonetti, M., & Stone, P. (2016). Source task creation for curriculum learning. In: Proceedings of the 2016 international conference on autonomous agents & multiagent systems (pp. 566–574). International Foundation for Autonomous Agents and Multiagent Systems.

  • Niël, R., & Wiering, M. A. (2018). Hierarchical reinforcement learning for playing a dynamic dungeon crawler game. In 2018 IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 1159–1166). IEEE.

  • Oudeyer, P.-Y. (2012). Developmental robotics. In Encyclopedia of the sciences of learning (pp 969–972). Springer.

  • Oudeyer, P.-Y., & Kaplan, F. (2007). What is intrinsic motivation? a typology of computational approaches. Frontiers in Neurorobotics, 1, 6.


  • Piaget, J. (1954). The construction of reality in the child. New York: Basic Books.


  • Piaget, J., & Duckworth, E. (1970). Genetic epistemology. American Behavioral Scientist, 13(3), 459–480.


  • Reynolds, C. W. (1987). Flocks, herds and schools: A distributed behavioral model. In ACM SIGGRAPH computer graphics (Vol. 21, pp. 25–34). ACM.

  • Rudolph, G. (1994). Convergence analysis of canonical genetic algorithms. IEEE Transactions on Neural Networks, 5(1), 96–101.


  • Russell, S., & Norvig, P. (2009). Artificial Intelligence: A Modern Approach (3rd ed.). Upper Saddle River: Prentice Hall Press.


  • Santucci, V. G., Baldassarre, G., & Cartoni, E. (2019). Autonomous reinforcement learning of multiple interrelated tasks. In 2019 Joint IEEE 9th international conference on development and learning and epigenetic robotics (ICDL-EpiRob) (pp. 221–227). IEEE.

  • Santucci, V. G., Baldassarre, G., & Mirolli, M. (2016). Grail: A goal-discovering robotic architecture for intrinsically-motivated learning. IEEE Transactions on Cognitive and Developmental Systems, 8(3), 214–231.


  • Schrum, J., & Miikkulainen, R. (2015). Discovering multimodal behavior in MS PAC-man through evolution of modular neural networks. IEEE Transactions on Computational Intelligence and AI in Games, 8(1), 67–81.


  • Simonin, O., & Ferber, J. (2000). Modeling self satisfaction and altruism to handle action selection and reactive cooperation. In Proceedings of the 6th international conference on the simulation of adaptive behavior (Vol. 2, pp. 314–323).

  • Stanley, K. O., & Miikkulainen, R. (2002). Evolving neural networks through augmenting topologies. Evolutionary Computation, 10(2), 99–127.


  • Stone, P., & Veloso, M. (2000). Layered learning. In European conference on machine learning (pp. 369–381). Springer.

  • Whiteson, S., Kohl, N., Miikkulainen, R., & Stone, P. (2003). Evolving keepaway soccer players through task decomposition. In Genetic and Evolutionary Computation Conference (pp. 356–368). Springer.


Acknowledgements

We thank our anonymous reviewers for their many constructive comments and suggestions, which greatly improved the present article. We also thank Eric Bourreau and Marianne Huchard (LIRMM) and the members of the SMILE team (LIRMM) for their comments that helped improve our initial work. This work was realized with the support of the High Performance Computing Platform HPC@LR, financed by the Occitanie / Pyrénées-Méditerranée Region, Montpellier Mediterranean Metropole and the University of Montpellier, France.

Author information


Corresponding author

Correspondence to François Suro.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix

The following tables give the settings used in scenario 1. All values are given in the units of the physics engine (JBox2d). These values were used to obtain the results shown in this article; however, other settings can be used, with substantial improvements in convergence time and computing cost.

Table 4 Settings used in scenario 1 (JBox2d engine)
Table 5 Exhaustive list of the settings and rewards of the curriculum used in scenario 1


About this article


Cite this article

Suro, F., Ferber, J., Stratulat, T. et al. A hierarchical representation of behaviour supporting open ended development and progressive learning for artificial agents. Auton Robot 45, 245–264 (2021). https://doi.org/10.1007/s10514-020-09960-7

