Abstract
Swarm robotics (SR) studies how to design large numbers of robots so that they generate meaningful collective behaviors. Reinforcement learning (RL) is a promising approach to designing their control policies. However, RL is well known to suffer from the sparse reward problem, especially on highly complex tasks, and curriculum learning (CL) is one effective way to overcome this difficulty. In this paper, we propose a novel method called Self-Teaching Automatic Curriculum Learning (STACL), in which the agents compare their training progress across different lessons to determine which lesson to train on in the next episode. We illustrate its effect on a collective wall-jumping task, in which the robots must generate collective wall-jumping behavior to get over a high wall and reach the goal as quickly as possible. Simulation results show that, among the approaches compared, the proposed one converges fastest and performs most stably. We also conducted experiments to examine the flexibility of the developed controllers.
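The abstract states only that agents compare training progress across lessons to choose the next one; it does not say how progress is measured. The sketch below is therefore an illustration under stated assumptions, not the STACL rule itself. It assumes progress is the absolute change in mean episode return between two consecutive windows, that lessons with too little data are tried first, and that a small `epsilon` adds random exploration. The class `LessonSelector` and all of its parameters are hypothetical names introduced here.

```python
import random
from collections import deque


class LessonSelector:
    """Hypothetical progress-based lesson picker (an assumption, not the paper's rule)."""

    def __init__(self, num_lessons, window=4, epsilon=0.1, seed=0):
        self.window = window
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Keep only the last 2 * window episode returns for each lesson.
        self.history = [deque(maxlen=2 * window) for _ in range(num_lessons)]

    def record(self, lesson, episode_return):
        """Store the return obtained in one episode of the given lesson."""
        self.history[lesson].append(float(episode_return))

    def progress(self, lesson):
        """Assumed progress signal: |mean(newer window) - mean(older window)|."""
        h = list(self.history[lesson])
        if len(h) < 2 * self.window:
            return float("inf")  # too little data: try this lesson first
        older = sum(h[: self.window]) / self.window
        newer = sum(h[self.window:]) / self.window
        return abs(newer - older)

    def next_lesson(self):
        """Pick the lesson for the next episode."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.history))  # explore
        scores = [self.progress(i) for i in range(len(self.history))]
        return max(range(len(scores)), key=scores.__getitem__)


# Usage: three lessons with made-up returns; lesson 1 changes the most.
selector = LessonSelector(num_lessons=3, window=2, epsilon=0.0)
for lesson, returns in enumerate([[0, 0, 0, 0], [0, 0, 1, 1], [1, 1, 1.5, 1.5]]):
    for r in returns:
        selector.record(lesson, r)
print(selector.next_lesson())
```

With these made-up returns the selector picks lesson 1, whose mean return changed most between windows. Taking the absolute value means a lesson whose performance is dropping is also revisited; that is a design choice of this sketch rather than something the abstract specifies.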










Acknowledgements
This work was supported by the Initiative for Realizing Diversity in the Research Environment (Specific Correspondence Type).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
SWARM Special Issue: This work was presented in part at the joint symposium of the 27th International Symposium on Artificial Life and Robotics, the 7th International Symposium on BioComplexity, and the 5th International Symposium on Swarm Behavior and Bio-Inspired Robotics (Online, January 25–27, 2022).
About this article
Cite this article
Nie, X., Liang, Y., Han, Z. et al. Generating collective wall-jumping behavior for a robotic swarm with self-teaching automatic curriculum learning. Artif Life Robotics 28, 67–75 (2023). https://doi.org/10.1007/s10015-022-00833-z
DOI: https://doi.org/10.1007/s10015-022-00833-z