Abstract:
We present a novel multi-objective reinforcement learning formulation of the decentralized formation control problem for swarms of fixed-wing UAVs based on a relative state-space construction of obstacles and waypoints. Specifically, a modular state-action-reward-state-action (SARSA) approach is applied to a set of behavioral swarm rules (i.e., Reynolds flocking, target seek, and obstacle avoidance). Q-tables are trained offline for each rule using a simulated model of the UAV plant over scenarios with randomized waypoint and obstacle locations. The greatest mass approach is applied to action selection for each UAV by selecting the relative heading that maximizes the weighted sum of Q-values across all Q-tables. To restrain the dimensionality of the state space, object-focused learning is applied during policy evaluation to instantiate multiple Q-tables for each detected neighbor UAV and obstacle. In comparison to existing behavioral swarm controllers, the controller learned via object-focused greatest mass SARSA (OF-GM-SARSA) generalizes across multiple scenarios, demonstrating the desired swarm behaviors while incurring a negligible number of collisions.
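The core mechanism described above, tabular SARSA updates per behavioral module combined with greatest-mass action selection over the instantiated Q-tables, can be sketched as follows. This is a minimal illustration under assumed discretizations; the module names, table shapes, weights, and learning parameters are placeholders, not values from the paper.

```python
import numpy as np

# Number of discretized relative-heading actions (an assumption for
# illustration; the paper's discretization may differ).
N_HEADINGS = 8

def sarsa_update(q_table, s, a, r, s_next, a_next, alpha=0.1, gamma=0.95):
    """Standard tabular SARSA update for one behavioral module's Q-table."""
    td_target = r + gamma * q_table[s_next, a_next]
    q_table[s, a] += alpha * (td_target - q_table[s, a])

def greatest_mass_action(q_tables, states, weights):
    """Greatest-mass action selection: choose the relative heading that
    maximizes the weighted sum of Q-values across all module Q-tables
    (one table per behavior rule / detected object)."""
    total = np.zeros(N_HEADINGS)
    for q, s, w in zip(q_tables, states, weights):
        total += w * q[s]  # q[s] is the row of Q-values for this module's state
    return int(np.argmax(total))

# Example: three modules (flocking, target seek, obstacle avoidance),
# each with its own discretized relative state and behavior weight.
rng = np.random.default_rng(0)
q_tables = [rng.random((16, N_HEADINGS)) for _ in range(3)]
states = [3, 7, 1]          # each module's current relative-state index
weights = [1.0, 0.5, 2.0]   # behavior priorities (illustrative values)
action = greatest_mass_action(q_tables, states, weights)
```

In the object-focused variant, additional Q-table instances would be created at runtime for each detected neighbor or obstacle and folded into the same weighted sum, which keeps each individual table's state space small.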
Published in: 2018 Annual American Control Conference (ACC)
Date of Conference: 27-29 June 2018
Date Added to IEEE Xplore: 16 August 2018
Electronic ISSN: 2378-5861