
A fully distributed multi-robot navigation method without pre-allocating target positions

Published in: Autonomous Robots

Abstract

This study addresses the multi-robot navigation problem under unpredictable state-transition disturbances. The primary goal is to construct a fully distributed multi-robot navigation method that does not pre-allocate target positions. To this end, a reinforcement-learning-based method is presented, in which a state-transition-distribution module is proposed to guarantee adaptiveness when the trained policies are deployed on physical multi-robot systems. The method adopts a centralized-training, fully-distributed-execution framework: centralized training eliminates the non-stationarity of the environment, while fully distributed execution enables the robots to collaboratively handle partially observable scenarios. Meanwhile, the designed reward function guides the robots toward target positions that are not pre-allocated, and nearly optimal trajectories are achieved in a continuous environment. After training, the robots make decisions independently and coordinate and cooperate with each other to determine their next actions, eventually arriving at target positions without pre-allocation along nearly optimal trajectories, even though only partial observations are available to each robot. Simulations are performed in increasingly complex environments, including ones with static obstacles and randomly moving obstacles. The results show that the robots achieve the primary goal under different state-transition disturbances, demonstrating the feasibility, effectiveness, and robustness of the method. Furthermore, experiments are carried out on our multi-robot system corresponding to the simulations. The experimental results demonstrate the effectiveness and robustness of the proposed navigation method in a variety of typical robotic scenarios.
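To illustrate the idea of navigating toward targets that are not pre-allocated, here is a minimal sketch, not the authors' actual reward design: all function names, constants, and the shaping scheme below are assumptions. Each robot is rewarded for progress toward its currently nearest unoccupied target, so the robot-to-target assignment emerges during execution rather than being fixed in advance.

```python
import math

# Illustrative constants (assumptions, not values from the paper).
ARRIVAL_BONUS = 10.0   # bonus for reaching any free target
PROGRESS_GAIN = 1.0    # weight on per-step distance progress
ARRIVAL_RADIUS = 0.1   # radius within which a target counts as reached


def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def nearest_free_target(pos, targets, occupied):
    """Pick the closest target whose index is not yet occupied."""
    free = [t for i, t in enumerate(targets) if i not in occupied]
    return min(free, key=lambda t: dist(pos, t))


def step_reward(prev_pos, pos, targets, occupied):
    """Reward = progress toward the closest unoccupied target,
    plus a bonus on arrival; no target is assigned in advance."""
    target = nearest_free_target(pos, targets, occupied)
    if dist(pos, target) < ARRIVAL_RADIUS:
        return ARRIVAL_BONUS
    return PROGRESS_GAIN * (dist(prev_pos, target) - dist(pos, target))
```

Because every robot evaluates the same rule locally from its own observations, this kind of reward is compatible with fully distributed execution: no central allocator decides which robot takes which target.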



Acknowledgements

This work is supported by projects of the National Natural Science Foundation of China (Nos. 61603277, 61873192, 61733001) and by the Key Pre-Research Project of the 13th Five-Year Plan on Common Technology (No. 41412050101). It is also partially supported by the Fundamental Research Funds for the Central Universities and the Youth 1000 Program, and sponsored by the International Joint Project between Shanghai, China and Baden-Württemberg, Germany (No. 19510711100) within the Shanghai Science and Technology Innovation Plan, as well as by projects supported by the China Academy of Space Technology and Launch Vehicle Technology. All of this support is highly appreciated. We also thank the Shanghai Key Laboratory for Planetary Mapping and Remote Sensing for Deep Space Exploration for their support.

Author information


Corresponding author

Correspondence to Qirong Tang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 25268 KB)

Supplementary material 2 (mp4 34926 KB)


About this article


Cite this article

Zhang, J., Xu, Z., Yu, F. et al. A fully distributed multi-robot navigation method without pre-allocating target positions. Auton Robot 45, 473–492 (2021). https://doi.org/10.1007/s10514-021-09981-w


Keywords

Navigation