
A fully distributed multi-robot navigation method without pre-allocating target positions

Published in: Autonomous Robots

Abstract

This study addresses the multi-robot navigation problem under unpredictable state-transition disturbances. The primary goal is to construct a fully distributed multi-robot navigation method that does not pre-allocate target positions. To this end, a reinforcement-learning-based method is presented, in which a state-transition-distribution module is proposed to guarantee adaptiveness when the trained policies are deployed on physical multi-robot systems. The method adopts a centralized-training, fully-distributed-execution framework: centralized training eliminates the non-stationarity of the environment, while fully distributed execution enables the robots to collaboratively handle partially observable scenarios. Meanwhile, the designed reward function guides the robots toward target positions that are not pre-allocated, and nearly optimal trajectories are achieved in a continuous environment. After training, the robots make decisions independently and coordinate and cooperate with each other to determine their next actions, eventually arriving at target positions without pre-allocation along nearly optimal trajectories, even though only partial observations are available to each robot. Simulations are performed in increasingly complex environments, including ones with static obstacles and randomly moving obstacles. The results show that the robots achieve the primary goal under different state-transition disturbances, demonstrating the feasibility, effectiveness, and robustness of the method. Furthermore, experiments are carried out on our multi-robot system corresponding to the simulations. The experimental results demonstrate the effectiveness and robustness of the proposed navigation method in a variety of typical robotic scenarios.
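To illustrate the idea of navigating toward targets that are not pre-allocated, here is a minimal sketch, not the authors' actual reward design: all function names, constants, and the shaping scheme below are assumptions. Each robot is rewarded for progress toward its currently nearest unoccupied target, so the robot-to-target assignment emerges during execution rather than being fixed in advance.

```python
import math

# Illustrative constants (assumptions, not values from the paper).
ARRIVAL_BONUS = 10.0   # bonus for reaching any free target
PROGRESS_GAIN = 1.0    # weight on per-step distance progress
ARRIVAL_RADIUS = 0.1   # radius within which a target counts as reached


def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def nearest_free_target(pos, targets, occupied):
    """Pick the closest target whose index is not yet occupied."""
    free = [t for i, t in enumerate(targets) if i not in occupied]
    return min(free, key=lambda t: dist(pos, t))


def step_reward(prev_pos, pos, targets, occupied):
    """Reward = progress toward the closest unoccupied target,
    plus a bonus on arrival; no target is assigned in advance."""
    target = nearest_free_target(pos, targets, occupied)
    if dist(pos, target) < ARRIVAL_RADIUS:
        return ARRIVAL_BONUS
    return PROGRESS_GAIN * (dist(prev_pos, target) - dist(pos, target))
```

Because every robot evaluates the same rule locally from its own observations, this kind of reward is compatible with fully distributed execution: no central allocator decides which robot takes which target.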



Acknowledgements

This work is supported by projects of the National Natural Science Foundation of China (Nos. 61603277, 61873192, 61733001) and by the Key Pre-Research Project of the 13th Five-Year Plan on Common Technology (No. 41412050101). It is also partially supported by the Fundamental Research Funds for the Central Universities and the Youth 1000 Program, and sponsored by the International Joint Project between Shanghai, China and Baden-Württemberg, Germany (No. 19510711100) within the Shanghai Science and Technology Innovation Plan, as well as by projects supported by the China Academy of Space Technology and Launch Vehicle Technology. All of this support is highly appreciated. We also thank the Shanghai Key Laboratory for Planetary Mapping and Remote Sensing for Deep Space Exploration for their support.

Author information


Corresponding author

Correspondence to Qirong Tang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 25268 KB)

Supplementary material 2 (mp4 34926 KB)


About this article


Cite this article

Zhang, J., Xu, Z., Yu, F. et al. A fully distributed multi-robot navigation method without pre-allocating target positions. Auton Robot 45, 473–492 (2021). https://doi.org/10.1007/s10514-021-09981-w


Keywords

Navigation