Abstract:
This work addresses a multi-agent cooperative navigation problem in which multiple agents work together in an unknown environment to reach different targets without collision while minimizing the maximum navigation time among them. Typical reinforcement learning-based solutions directly model the cooperative navigation policy as a steering policy. However, when an agent does not know which target to head for, this approach can prolong convergence and reduce overall performance. To this end, we model the navigation policy as a combination of a dynamic target selection policy and a collision avoidance policy. Since these two policies are coupled, an interlaced deep reinforcement learning method is proposed to learn them simultaneously. Additionally, the reward function is derived directly from the optimization objective rather than designed heuristically. Extensive experiments demonstrate that the proposed method converges quickly and generates a more efficient navigation policy compared with the state of the art.
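To make the optimization objective concrete, the sketch below brute-forces a dynamic agent-to-target assignment that minimizes the maximum straight-line distance, a proxy for the maximum navigation time described above. This is only an illustration of the objective; the paper itself learns the target selection policy jointly with collision avoidance via interlaced deep reinforcement learning, and the 2-D points and helper names here are assumptions, not the paper's code.

```python
import itertools

def assign_targets(agents, targets):
    """Illustrative dynamic target selection: pick the agent-to-target
    assignment that minimizes the MAXIMUM straight-line distance
    (a stand-in for the min-max navigation time objective).
    agents, targets: lists of (x, y) tuples of equal length.
    Returns (assignment, cost) where assignment[i] is the target
    index for agent i."""
    def dist(a, t):
        return ((a[0] - t[0]) ** 2 + (a[1] - t[1]) ** 2) ** 0.5

    best, best_cost = None, float("inf")
    # Brute force over all permutations; fine for small agent counts,
    # which is why the paper learns this policy instead for scale.
    for perm in itertools.permutations(range(len(targets))):
        cost = max(dist(agents[i], targets[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = list(perm), cost
    return best, best_cost

assignment, cost = assign_targets([(0, 0), (1, 0)], [(1, 0), (0, 0)])
# Agent 0 takes target 1 at (0, 0), agent 1 takes target 0 at (1, 0):
# assignment == [1, 0], cost == 0.0
```

Note that the min-max criterion can differ from minimizing total distance: it trades a longer path for one agent against reducing the worst-case finishing time, which is the quantity the paper's reward is derived from.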
Published in: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 12-17 May 2019
Date Added to IEEE Xplore: 17 April 2019