
Monte-Carlo Tree Search for Multi-agent Pathfinding: Preliminary Results

  • Conference paper
  • Published in: Hybrid Artificial Intelligent Systems (HAIS 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14001)


Abstract

In this work we study the well-known and challenging problem of Multi-agent Pathfinding (MAPF), in which a set of agents is confined to a graph, each agent is assigned unique start and goal vertices, and the task is to find a set of collision-free paths (one per agent) such that every agent reaches its respective goal. We investigate how to utilize Monte-Carlo Tree Search (MCTS) to solve this problem. Although MCTS has demonstrated superior performance in a wide range of problems, such as playing adversarial games (e.g., Go and Chess) and discovering faster matrix multiplication algorithms, its application to MAPF has not been well studied before. To this end, we introduce an original variant of MCTS tailored to multi-agent pathfinding. The crux of our approach is how the reward that guides MCTS is computed. Specifically, we use individual paths to assist the agents with goal-reaching behavior, while leaving them the freedom to deviate from those paths when necessary to avoid collisions. We also use a dedicated decomposition technique to reduce the branching factor of the tree search procedure. Empirically, we show that the suggested method outperforms a baseline planner that invokes heuristic search, e.g. A*, at each re-planning step.
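To make the ingredients mentioned in the abstract concrete, the following is a minimal single-agent sketch of MCTS with UCB1 selection (UCT) on a 4-connected grid, where the rollout reward is shaped by the remaining Manhattan distance to the goal, echoing the idea of using an individual path/distance estimate to guide the search. This is an illustrative toy only, not the authors' algorithm: the grid is obstacle-free, there is a single agent, and all names (`ucb1`, `rollout`, `mcts`) are hypothetical.

```python
import math
import random

# Five actions on a 4-connected grid, including "wait" (0, 0).
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]

def ucb1(child_value, child_visits, parent_visits, c=1.4):
    """UCB1 score used by UCT to balance exploration and exploitation."""
    if child_visits == 0:
        return float("inf")
    return child_value / child_visits + c * math.sqrt(
        math.log(parent_visits) / child_visits)

class Node:
    def __init__(self, pos, parent=None):
        self.pos = pos
        self.parent = parent
        self.children = {}   # move -> Node
        self.visits = 0
        self.value = 0.0

def rollout(pos, goal, size, depth=20, rng=random):
    """Random playout; reward = 1 - normalized final Manhattan distance."""
    for _ in range(depth):
        if pos == goal:
            break
        dx, dy = rng.choice(MOVES)
        nx, ny = pos[0] + dx, pos[1] + dy
        if 0 <= nx < size and 0 <= ny < size:
            pos = (nx, ny)
    dist = abs(pos[0] - goal[0]) + abs(pos[1] - goal[1])
    return 1.0 - dist / (2 * (size - 1))

def mcts(start, goal, size, iterations=400, rng=random):
    root = Node(start)
    for _ in range(iterations):
        node = root
        # Selection: descend via UCB1 while the node is fully expanded.
        while len(node.children) == len(MOVES):
            node = max(node.children.values(),
                       key=lambda ch: ucb1(ch.value, ch.visits, node.visits))
        # Expansion: try one untried move (clamped to stay on the grid).
        move = rng.choice([m for m in MOVES if m not in node.children])
        nx = min(max(node.pos[0] + move[0], 0), size - 1)
        ny = min(max(node.pos[1] + move[1], 0), size - 1)
        child = Node((nx, ny), parent=node)
        node.children[move] = child
        # Simulation + backpropagation of the shaped reward.
        reward = rollout(child.pos, goal, size, rng=rng)
        while child is not None:
            child.visits += 1
            child.value += reward
            child = child.parent
    # Act greedily: pick the most-visited child of the root.
    return max(root.children, key=lambda m: root.children[m].visits)
```

Note the motivation for the decomposition mentioned in the abstract: with n agents and 5 moves each, a naive joint-action tree has branching factor 5^n, which is intractable even for modest n; expanding agents one at a time (as sketches of such decompositions typically do) keeps the per-node branching factor at 5 at the cost of a deeper tree.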


Notes

  1. https://github.com/AIRI-Institute/pogema.

  2. https://github.com/deepmind/labmaze.


Author information

Correspondence to Alexey Skrynnik.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Pitanov, Y., Skrynnik, A., Andreychuk, A., Yakovlev, K., Panov, A. (2023). Monte-Carlo Tree Search for Multi-agent Pathfinding: Preliminary Results. In: García Bringas, P., et al. Hybrid Artificial Intelligent Systems. HAIS 2023. Lecture Notes in Computer Science, vol. 14001. Springer, Cham. https://doi.org/10.1007/978-3-031-40725-3_55


  • DOI: https://doi.org/10.1007/978-3-031-40725-3_55

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-40724-6

  • Online ISBN: 978-3-031-40725-3

  • eBook Packages: Computer Science (R0)
