Abstract:
End-to-end learning systems are becoming increasingly common in autonomous driving research, from perception to planning and control. In particular, distributed reinforcement learning systems have demonstrated their applicability to the intersection navigation scenario. Such systems learn via a scalar reward signal from the environment, and the design of that signal is crucial to overall task performance. In this paper, we investigate an alternative approach to achieving desirable behavior: applying constraints to the action spaces and policies of the agents while maintaining a relatively sparse reward regimen. Initial experiments in a simulation environment demonstrate the efficacy of this approach with simple restrictions in a discrete action space when compared to traditional traffic signal controllers and other Q-learning multi-agent reinforcement learning (MARL) algorithms. The performance analysis suggests that a more flexible action restriction may be more appropriate, but it nonetheless validates the utility of the approach in minimising delay and time loss. We hope this will stimulate additional research in policy constraints for autonomous driving.
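The abstract does not say how the action restrictions are implemented. As a generic illustration only, and not the authors' method, one common way to constrain a discrete action space in Q-learning is to mask disallowed actions before epsilon-greedy selection. The function name `masked_epsilon_greedy`, its parameters, and the example values are all hypothetical.

```python
import random


def masked_epsilon_greedy(q_values, allowed, epsilon, rng=random):
    """Choose an action index by epsilon-greedy, restricted to allowed actions.

    q_values: list of Q-value estimates, one per discrete action.
    allowed:  list of booleans of the same length; False marks a
              restricted action (hypothetical restriction mask).
    epsilon:  exploration probability in [0, 1].
    """
    # Only actions that pass the restriction are ever considered.
    candidates = [a for a, ok in enumerate(allowed) if ok]
    if not candidates:
        raise ValueError("at least one action must be allowed")
    # Explore: sample uniformly among the allowed actions.
    if rng.random() < epsilon:
        return rng.choice(candidates)
    # Exploit: take the allowed action with the highest Q-value.
    return max(candidates, key=lambda a: q_values[a])
```

With `q_values = [1.0, 5.0, 3.0]`, `allowed = [True, False, True]`, and `epsilon = 0.0`, the function returns index 2: the highest-valued action (index 1) is restricted, so the best allowed action is selected instead. Because the restriction is applied at selection time, the reward signal itself can stay sparse.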
Date of Conference: 24-28 September 2023
Date Added to IEEE Xplore: 13 February 2024