Discrete-Time Deterministic Q-Learning: A Novel Convergence Analysis


Abstract:

In this paper, a novel discrete-time deterministic Q-learning algorithm is developed. In each iteration of the developed Q-learning algorithm, the iterative Q function is updated over the entire state and control spaces, rather than for a single state and a single control as in traditional Q-learning algorithms. A new convergence criterion is established to guarantee that the iterative Q function converges to the optimum, and the convergence criterion on the learning rates used in traditional Q-learning algorithms is simplified. In the convergence analysis, the upper and lower bounds of the iterative Q function are analyzed to obtain the convergence criterion, instead of analyzing the iterative Q function itself. For convenience of analysis, the convergence properties for the undiscounted case of the deterministic Q-learning algorithm are developed first. Then, considering the discount factor, the convergence criterion for the discounted case is established. Neural networks are used to approximate the iterative Q function and to compute the iterative control law, respectively, to facilitate the implementation of the deterministic Q-learning algorithm. Finally, simulation results and comparisons are given to illustrate the performance of the developed algorithm.
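The full-sweep update described in the abstract can be illustrated with a minimal tabular sketch. This is not the paper's implementation: the update rule below is the standard Q-learning form applied to every state-control pair at once, and the five-state toy system, the quadratic utility, the constant learning rate `alpha`, and all function names are assumptions made only for illustration. A lookup table stands in for the neural networks mentioned in the abstract.

```python
# Illustrative sketch only: every name and the toy system are hypothetical.

def deterministic_q_learning(states, controls, dynamics, utility,
                             alpha=0.5, gamma=1.0, iterations=500):
    """Sweep-style Q-learning for a deterministic system x_next = dynamics(x, u).

    Each iteration updates Q for ALL (state, control) pairs, using
    Q_next(x, u) = (1 - alpha) * Q(x, u)
                   + alpha * (utility(x, u) + gamma * min_u' Q(x_next, u')).
    gamma = 1.0 corresponds to the undiscounted case.
    """
    q = {(x, u): 0.0 for x in states for u in controls}  # assumed zero start
    for _ in range(iterations):
        # Best achievable value at each state under the current Q table.
        value = {x: min(q[(x, u)] for u in controls) for x in states}
        # Build the new table from the old one, so the sweep is simultaneous.
        q = {
            (x, u): (1.0 - alpha) * q[(x, u)]
            + alpha * (utility(x, u) + gamma * value[dynamics(x, u)])
            for x in states
            for u in controls
        }
    return q


# Hypothetical toy problem: drive an integer state in {0, ..., 4} to 0.
STATES = range(5)
CONTROLS = (-1, 0, 1)


def toy_dynamics(x, u):
    # Deterministic transition, clipped to the state grid.
    return min(max(x + u, 0), 4)


def toy_utility(x, u):
    # Quadratic cost that is zero only at x = 0 with u = 0.
    return float(x * x + u * u)


q_table = deterministic_q_learning(STATES, CONTROLS, toy_dynamics, toy_utility)

# Greedy control law read off the learned table.
greedy_control = {x: min(CONTROLS, key=lambda u: q_table[(x, u)]) for x in STATES}
```

In this toy problem the greedy law steps toward the origin from every nonzero state and holds at zero. A discounted variant is obtained by passing `gamma` below 1; the conditions under which such an iteration converges are the subject of the paper and are not reproduced here.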
Published in: IEEE Transactions on Cybernetics (Volume: 47, Issue: 5, May 2017)
Page(s): 1224 - 1237
Date of Publication: 11 April 2016

PubMed ID: 27093714
