Deep Policy-Gradient Based Path Planning and Reinforcement Cooperative Q-Learning Behavior of Multi-Vehicle Systems | IEEE Conference Publication | IEEE Xplore