Loading [a11y]/accessibility-menu.js
Friend-or-Foe Deep Deterministic Policy Gradient | IEEE Conference Publication | IEEE Xplore

Friend-or-Foe Deep Deterministic Policy Gradient


Abstract:

One of the toughest challenges in the multi-agent deep reinforcement learning (MADRL) is that when the opponents' policies change rapidly, the collaborative agents can't ...Show More

Abstract:

One of the toughest challenges in the multi-agent deep reinforcement learning (MADRL) is that when the opponents' policies change rapidly, the collaborative agents can't learn well to respond to the opponents' policies effectively. This may lead to a local optimum w.r.t. the learned policy of the collaborative agents may be only locally optimal to the opponents' current policies. To address this problem, we propose a novel algorithm termed Friend-or-Foe Deep Deterministic Policy Gradient (FD2PG), in which the cooperative agents can be trained more robust and have stronger cooperation ability in continuous action space. These collaborative agents can generalize easily and respond correctly, even if their opponents' policies alter. Inspired by the classic Friend-or-Foe Q-learning algorithm (FFQ), we introduce the idea of minimizing the foes and maximizing the friends into the centralized training distributed execution framework, multi-agent deep deterministic policy gradient algorithm (MADDPG), to enhance collaborative agents' robustness and cooperativity. Besides, we introduce a Minimax Multi-Agent Learning (MMAL) method to explore two special equilibriums (the adversarial equilibrium and the coordination equilibrium), which can guarantee the convergence of FD2PG and improve optimization. Extensive fine-grained experiments, including four representative scenario experiments and two scale-performance correlation experiments, were conducted to demonstrate the superior performance of FD2PG comparing with existing baselines.
Date of Conference: 11-14 October 2020
Date Added to IEEE Xplore: 14 December 2020
ISBN Information:

ISSN Information:

Conference Location: Toronto, ON, Canada

Funding Agency:


References

References is not available for this document.