Abstract
One of the challenging aspects of open-ended or lifelong agent development is that the behaviour an agent is trained for at a given moment may later become a building block for one or several more complex behaviours whose purpose cannot be anticipated. In this paper, we present modular influence network design (MIND), an artificial agent control architecture suited to open-ended and cumulative learning. The MIND architecture encapsulates sub-behaviours into modules and combines them into a hierarchy reflecting the modular and hierarchical nature of complex tasks. Compared to similar research, the main original aspect of MIND is its multi-layered hierarchy, which uses a generic control signal, the influence, to obtain an efficient global behaviour. This article shows the ability of MIND to learn a curriculum of independent didactic tasks of increasing complexity covering different aspects of a desired behaviour. In so doing, we demonstrate the contributions of MIND to open-ended development: encapsulation into modules allows for the preservation and reusability of all the skills acquired during the curriculum and for their focused retraining, and the modular structure serves the evolving topology by easing the coordination of new sensors, actuators and heterogeneous learning structures.
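To make the idea of a module hierarchy driven by an influence signal more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a leaf module maps sensor readings to motor commands, that an inner module emits one non-negative influence per sub-module, and that motor commands are blended as an influence-weighted average. The class names (`BaseSkill`, `ComplexSkill`), the sensor and motor keys, and the blending rule are all hypothetical; hand-written lambdas stand in for the learned structures.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Sensors = Dict[str, float]
Motors = Dict[str, float]


@dataclass
class BaseSkill:
    """Hypothetical leaf module: maps sensor readings straight to motor commands."""
    name: str
    policy: Callable[[Sensors], Motors]

    def act(self, sensors: Sensors) -> Motors:
        return self.policy(sensors)


@dataclass
class ComplexSkill:
    """Hypothetical inner module: emits one influence per sub-module
    instead of driving the motors directly."""
    name: str
    children: Sequence  # BaseSkill or ComplexSkill instances
    influence_fn: Callable[[Sensors], List[float]]

    def act(self, sensors: Sensors) -> Motors:
        influences = self.influence_fn(sensors)
        total = sum(influences)
        blended: Motors = {}
        if total <= 0:
            return blended  # no sub-module is influenced, so no command is issued
        # Assumed blending rule: influence-weighted average of sub-module commands.
        for child, weight in zip(self.children, influences):
            for motor, value in child.act(sensors).items():
                blended[motor] = blended.get(motor, 0.0) + (weight / total) * value
        return blended


# Two previously acquired skills, kept intact inside their own modules.
seek = BaseSkill("seek", lambda s: {"turn": s["target_bearing"], "speed": 1.0})
avoid = BaseSkill("avoid", lambda s: {"turn": -s["obstacle_bearing"], "speed": 0.2})

# A later, more complex skill reuses both by distributing influence between them.
navigate = ComplexSkill(
    "navigate",
    [seek, avoid],
    lambda s: [1.0 - s["obstacle_proximity"], s["obstacle_proximity"]],
)

sensors = {"target_bearing": 0.5, "obstacle_bearing": 0.5, "obstacle_proximity": 0.75}
command = navigate.act(sensors)
print(command)
```

In this sketch, adding a further layer amounts to wrapping `navigate` in another `ComplexSkill`, which leaves the existing modules untouched; this is the property the abstract refers to as preservation and reusability of acquired skills.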
Notes
Objects can be naturally split into parts and sub-parts, complex features and simple features (Kruger et al. 2013).
JBox2d: www.jbox2d.org.
Videos of the results are available at the following addresses:
Raspberry PI3: www.raspberrypi.org/products/raspberry-pi-3-model-b-plus/.
Grove Pi: www.dexterindustries.com/grovepi/.
OpenCV computer vision library: https://opencv.org/.
Videos of the results are available at the following address: www.lirmm.fr/~suro/videos/clawDemo.mp4; https://hal.archives-ouvertes.fr/hal-02594407.
References
Arkin, R. C., & Balch, T. (1997). AuRA: Principles and practice in review. Journal of Experimental & Theoretical Artificial Intelligence, 9(2–3), 175–189.
Barto, A. G., Singh, S., & Chentanez, N. (2004). Intrinsically motivated learning of hierarchical collections of skills. In Proceedings of the 3rd International Conference on Development and Learning (pp. 112–19).
Bengio, Y., Louradour, J., Collobert, R., & Weston, J. (2009). Curriculum learning. In Proceedings of the 26th Annual International Conference on Machine Learning (pp. 41–48). ACM.
Blaes, S., Pogančić, M. V., Zhu, J., & Martius, G. (2019). Control what you can: Intrinsically motivated task-planning agent. In Advances in Neural Information Processing Systems (pp. 12520–12531).
Braitenberg, V. (1986). Vehicles: Experiments in synthetic psychology. Cambridge: MIT Press.
Brooks, R. A. (1986). A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, 2(1), 14–23.
Connor, J. T., Martin, R. D., & Atlas, L. E. (1994). Recurrent neural networks and robust time series prediction. IEEE Transactions on Neural Networks, 5(2), 240–254.
De Jong, K. A. (1992). Are genetic algorithms function optimizers? PPSN, 2, 3–14.
Devin, C., Gupta, A., Darrell, T., Abbeel, P., & Levine, S. (2017). Learning modular neural network policies for multi-task and multi-robot transfer. In 2017 IEEE international conference on robotics and automation (ICRA) (pp. 2169–2176). IEEE.
Dorigo, M., & Colombetti, M. (1994). Robot shaping: Developing autonomous agents through learning. Artificial Intelligence, 71(2), 321–370.
Dorigo, M., & Colombetti, M. (1998). Robot shaping: An experiment in behavior engineering. Cambridge: MIT Press.
Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1(1), 1–47.
Foglino, F., Christakou, C. C., & Leonetti, M. (2019). An optimization framework for task sequencing in curriculum learning. In Joint IEEE 9th International Conference ICDL-EpiRob (pp. 207–214). IEEE.
Forestier, S., Mollard, Y., & Oudeyer, P.-Y. (2017). Intrinsically motivated goal exploration processes with automatic curriculum learning. arXiv preprint arXiv:1708.02190.
Gen, M., & Lin, L. (2007). Genetic algorithms. In Wiley Encyclopedia of Computer Science and Engineering (pp. 1–15).
Gülçehre, Ç., Moczulski, M., Visin, F., & Bengio, Y. (2016). Mollifying networks. CoRR, abs/1608.04980.
Heess, N., Wayne, G., Tassa, Y., Lillicrap, T. P., Riedmiller, M. A., & Silver, D. (2016). Learning and transfer of modulated locomotor controllers. CoRR, abs/1610.05182.
Hester, T., & Stone, P. (2017). Intrinsically motivated model learning for developing curious robots. Artificial Intelligence, 247, 170–186.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097–1105).
Kruger, N., Janssen, P., Kalkan, S., Lappe, M., Leonardis, A., Piater, J., et al. (2013). Deep hierarchies in the primate visual cortex: What can we learn for computer vision? IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1847–1871.
Larsen, T., & Hansen, S. T. (2005). Evolving composite robot behaviour: A modular architecture. In Proceedings of the Fifth International Workshop on Robot Motion and Control, 2005 (RoMoCo'05) (pp. 271–276). IEEE.
Lessin, D., Fussell, D., & Miikkulainen, R. (2013). Open-ended behavioral complexity for evolved virtual creatures. In Proceedings of the 15th annual conference on Genetic and evolutionary computation (pp. 335–342).
Lessin, D., Fussell, D., Miikkulainen, R., & Risi, S. (2015). Increasing behavioral complexity for evolved virtual creatures with the ESP method. arXiv preprint arXiv:1510.07957.
Lopes, M., & Oudeyer, P.-Y. (2012). The strategic student approach for life-long exploration and learning. In 2012 IEEE international conference on development and learning and epigenetic robotics (ICDL) (pp. 1–8). IEEE.
Lukoševičius, M., & Jaeger, H. (2009). Reservoir computing approaches to recurrent neural network training. Computer Science Review, 3(3), 127–149.
Lungarella, M., Metta, G., Pfeifer, R., & Sandini, G. (2003). Developmental robotics: A survey. Connection Science, 15(4), 151–190.
Narvekar, S., Sinapov, J., Leonetti, M., & Stone, P. (2016). Source task creation for curriculum learning. In Proceedings of the 2016 international conference on autonomous agents & multiagent systems (pp. 566–574). International Foundation for Autonomous Agents and Multiagent Systems.
Niël, R., & Wiering, M. A. (2018). Hierarchical reinforcement learning for playing a dynamic dungeon crawler game. In 2018 IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 1159–1166). IEEE.
Oudeyer, P.-Y. (2012). Developmental robotics. In Encyclopedia of the sciences of learning (pp. 969–972). Springer.
Oudeyer, P.-Y., & Kaplan, F. (2007). What is intrinsic motivation? A typology of computational approaches. Frontiers in Neurorobotics, 1, 6.
Piaget, J. (1954). The construction of reality in the child. New York: Basic Books.
Piaget, J., & Duckworth, E. (1970). Genetic epistemology. American Behavioral Scientist, 13(3), 459–480.
Reynolds, C. W. (1987). Flocks, herds and schools: A distributed behavioral model. In ACM SIGGRAPH computer graphics (Vol. 21, pp. 25–34). ACM.
Rudolph, G. (1994). Convergence analysis of canonical genetic algorithms. IEEE Transactions on Neural Networks, 5(1), 96–101.
Russell, S., & Norvig, P. (2009). Artificial Intelligence: A Modern Approach (3rd ed.). Upper Saddle River: Prentice Hall Press.
Santucci, V. G., Baldassarre, G., & Cartoni, E. (2019). Autonomous reinforcement learning of multiple interrelated tasks. In 2019 Joint IEEE 9th international conference on development and learning and epigenetic robotics (ICDL-EpiRob) (pp. 221–227). IEEE.
Santucci, V. G., Baldassarre, G., & Mirolli, M. (2016). Grail: A goal-discovering robotic architecture for intrinsically-motivated learning. IEEE Transactions on Cognitive and Developmental Systems, 8(3), 214–231.
Schrum, J., & Miikkulainen, R. (2015). Discovering multimodal behavior in Ms. Pac-Man through evolution of modular neural networks. IEEE Transactions on Computational Intelligence and AI in Games, 8(1), 67–81.
Simonin, O., & Ferber, J. (2000). Modeling self satisfaction and altruism to handle action selection and reactive cooperation. In Proceedings of the 6th international conference on the simulation of adaptive behavior (Vol. 2, pp. 314–323).
Stanley, K. O., & Miikkulainen, R. (2002). Evolving neural networks through augmenting topologies. Evolutionary Computation, 10(2), 99–127.
Stone, P., & Veloso, M. (2000). Layered learning. In European conference on machine learning (pp. 369–381). Springer.
Whiteson, S., Kohl, N., Miikkulainen, R., & Stone, P. (2003). Evolving keepaway soccer players through task decomposition. In Genetic and Evolutionary Computation Conference (pp. 356–368). Springer.
Acknowledgements
We thank our anonymous reviewers for their many constructive comments and suggestions, which greatly improved the present article. We also thank Eric Bourreau and Marianne Huchard (LIRMM) and the members of the SMILE team (LIRMM) for their comments, which helped improve our initial work. This work was carried out with the support of the High Performance Computing Platform HPC@LR, financed by the Occitanie / Pyrénées-Méditerranée Region, Montpellier Mediterranean Metropole and the University of Montpellier, France.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
The following tables give the settings used in scenario 1. All values are given in the units of the physics engine (JBox2d). These values were used to obtain the results shown in this article. However, other settings can be used, with substantial improvements in convergence time and computing cost.
Cite this article
Suro, F., Ferber, J., Stratulat, T. et al. A hierarchical representation of behaviour supporting open ended development and progressive learning for artificial agents. Auton Robot 45, 245–264 (2021). https://doi.org/10.1007/s10514-020-09960-7
DOI: https://doi.org/10.1007/s10514-020-09960-7