Distributed Reinforcement Learning for Coordinate Multi-Robot Foraging

  • Unmanned Systems Paper
Journal of Intelligent & Robotic Systems

Abstract

In this paper, we propose a distributed dynamic correlation matrix based multi-Q (D-DCM-Multi-Q) learning method for multi-robot systems. First, a dynamic correlation matrix is proposed for multi-agent reinforcement learning, which considers not only each individual robot’s Q-value but also the correlated Q-values of neighboring robots. Then, a theoretical analysis of the convergence of the D-DCM-Multi-Q method is provided. Simulations of multi-robot foraging, as well as a proof-of-concept experiment with a physical multi-robot system, have been conducted to evaluate the proposed method. The simulation and experimental results show the effectiveness, robustness, and stability of the proposed method.
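The abstract does not state the update rule, so the following is only an illustrative sketch of the general idea it describes: a robot scoring actions with its own Q-values together with its neighbors' Q-values, weighted by a correlation matrix. Everything specific here is an assumption and not the authors' formulation: the function names `combined_q`, `select_action`, and `q_update`, the linear weighting by `corr[i][j]`, the epsilon-greedy selection, and the use of a fixed matrix (the paper's matrix is dynamic, and how it is adapted is not given in the abstract). The per-robot update shown is the standard tabular Q-learning rule.

```python
import random

def combined_q(q_tables, corr, i, state, actions):
    """Correlation-weighted Q-values for robot i.

    Hypothetical combination rule: a linear blend of every robot's
    Q-value for (state, action), weighted by corr[i][j].
    """
    return {
        a: sum(corr[i][j] * q_tables[j].get((state, a), 0.0)
               for j in range(len(q_tables)))
        for a in actions
    }

def select_action(q_tables, corr, i, state, actions, epsilon, rng):
    """Epsilon-greedy choice over the combined Q-values (an assumption)."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    scores = combined_q(q_tables, corr, i, state, actions)
    return max(actions, key=lambda a: scores[a])

def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on one robot's own table."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]

# Usage: two robots, two actions, a fixed symmetric correlation matrix.
actions = ["left", "right"]
q_tables = [{("s0", "left"): 1.0}, {("s0", "right"): 4.0}]
corr = [[1.0, 0.5], [0.5, 1.0]]

scores = combined_q(q_tables, corr, 0, "s0", actions)
choice = select_action(q_tables, corr, 0, "s0", actions, 0.0, random.Random(0))
updated = q_update({}, "s0", "left", 1.0, "s1", actions)
```

In this toy setup robot 0 on its own would prefer `left`, but the neighbor's strong estimate for `right` shifts the combined score, which is the kind of coupling between neighboring robots' Q-values that the abstract refers to.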

Author information

Corresponding author

Correspondence to Yan Meng.

About this article

Cite this article

Guo, H., Meng, Y. Distributed Reinforcement Learning for Coordinate Multi-Robot Foraging. J Intell Robot Syst 60, 531–551 (2010). https://doi.org/10.1007/s10846-010-9429-4

  • DOI: https://doi.org/10.1007/s10846-010-9429-4
