Policy Gradient for Continuing Tasks in Discounted Markov Decision Processes | IEEE Journals & Magazine | IEEE Xplore