Actor-Critic Learning Control Based on ℓ₂-Regularized Temporal-Difference Prediction With Gradient Correction | IEEE Journals & Magazine | IEEE Xplore