Abstract
As robots begin to coexist with humans, the need for efficient and safe social robot navigation becomes increasingly pressing. In this paper we investigate how world models can enhance the effectiveness of reinforcement learning in social navigation tasks. We introduce three approaches that leverage predictive world models and benchmark them against state-of-the-art algorithms. For a comprehensive and reliable evaluation, we use multiple metrics during both the training and testing phases. The key novelty of our approach lies in the integration and evaluation of predictive world models within the context of social navigation, as well as in the models themselves. Across a diverse set of performance metrics, the experimental results provide evidence that predictive world models help improve reinforcement learning techniques for social navigation.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Oguzie, G., Ekart, A., Manso, L.J. (2024). Predictive World Models for Social Navigation. In: Naik, N., Jenkins, P., Grace, P., Yang, L., Prajapat, S. (eds) Advances in Computational Intelligence Systems. UKCI 2023. Advances in Intelligent Systems and Computing, vol 1453. Springer, Cham. https://doi.org/10.1007/978-3-031-47508-5_5
DOI: https://doi.org/10.1007/978-3-031-47508-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47507-8
Online ISBN: 978-3-031-47508-5
eBook Packages: Intelligent Technologies and Robotics (R0)