Abstract
In this work we study the well-known and challenging problem of multi-agent pathfinding (MAPF), in which a set of agents is confined to a graph, each agent is assigned unique start and goal vertices, and the task is to find a set of collision-free paths (one per agent) such that every agent reaches its respective goal. We investigate how Monte-Carlo Tree Search (MCTS) can be utilized to solve this problem. Although MCTS has demonstrated superior performance in a wide range of problems, such as playing antagonistic games (e.g. Go, Chess) and discovering faster matrix multiplication algorithms, its application to the problem at hand has not been well studied before. To this end, we introduce an original variant of MCTS tailored to multi-agent pathfinding. The crux of our approach is how the reward that guides MCTS is computed. Specifically, we use individual paths to assist the agents in goal-reaching behavior, while leaving them the freedom to deviate from these paths when needed to avoid collisions. We also use a dedicated decomposition technique to reduce the branching factor of the tree search procedure. Empirically, we show that the suggested method outperforms the baseline planning algorithm that invokes heuristic search, e.g. A*, at each re-planning step.
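The two ingredients named in the abstract can be sketched in code: the UCB1 rule that drives MCTS node selection, and a reward that favors agents staying on precomputed individual paths while penalizing collisions. This is a minimal illustration only; the function names, the per-agent bonus, and the collision penalty are our own assumptions, not the paper's exact formulation.

```python
import math


def ucb1(child_value, child_visits, parent_visits, c=1.41):
    # UCB1 selection score: average value plus an exploration bonus;
    # unvisited children score infinity so they are tried first.
    if child_visits == 0:
        return float("inf")
    return child_value / child_visits + c * math.sqrt(
        math.log(parent_visits) / child_visits
    )


def path_reward(positions, individual_paths, step):
    # Hypothetical path-guided reward: each agent earns +1 if it sits on
    # its precomputed individual path at this timestep (agents that have
    # finished their path are expected to wait at the goal). Any vertex
    # shared by two agents counts as a collision and yields a penalty.
    if len(set(positions)) < len(positions):
        return -float(len(positions))  # collision penalty
    reward = 0.0
    for pos, path in zip(positions, individual_paths):
        reference = path[min(step, len(path) - 1)]
        if pos == reference:
            reward += 1.0
    return reward
```

In a full planner, `path_reward` would score the terminal state of each MCTS rollout, and `ucb1` would pick the child joint action to descend into during the selection phase.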
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Pitanov, Y., Skrynnik, A., Andreychuk, A., Yakovlev, K., Panov, A. (2023). Monte-Carlo Tree Search for Multi-agent Pathfinding: Preliminary Results. In: García Bringas, P., et al. Hybrid Artificial Intelligent Systems. HAIS 2023. Lecture Notes in Computer Science(), vol 14001. Springer, Cham. https://doi.org/10.1007/978-3-031-40725-3_55
DOI: https://doi.org/10.1007/978-3-031-40725-3_55
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-40724-6
Online ISBN: 978-3-031-40725-3
eBook Packages: Computer Science (R0)