Convergence and Iteration Complexity of Policy Gradient Method for Infinite-horizon Reinforcement Learning | IEEE Conference Publication | IEEE Xplore