
A Residual Gradient Fuzzy Reinforcement Learning Algorithm for Differential Games

Published in: International Journal of Fuzzy Systems

Abstract

In this work, we propose a new fuzzy reinforcement learning algorithm for differential games with continuous state and action spaces. Unlike the algorithms presented in the literature, which use direct algorithms to update the parameters of their function approximation systems, the proposed algorithm uses the residual gradient value iteration algorithm to tune both the input and output parameters of its function approximation systems. It has been shown in the literature that direct algorithms may fail to converge in some cases, while residual gradient algorithms are always guaranteed to converge to a local minimum. We call the proposed algorithm the residual gradient fuzzy actor–critic learning (RGFACL) algorithm and use it to learn three different pursuit–evasion differential games. Simulation results show that the proposed RGFACL algorithm outperforms the fuzzy actor–critic learning (FACL) and Q-learning fuzzy inference system (QLFIS) algorithms in terms of both convergence and speed of learning.
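The contrast the abstract draws between direct and residual gradient updates can be made concrete with a short sketch. The snippet below implements both update rules for one-step temporal-difference learning with a generic linear value function V(s) = wᵀφ(s). Note that the paper's function approximators are fuzzy inference systems with tunable input and output parameters, so the linear features, function names, and step sizes here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def td_error(w, phi_s, phi_next, r, gamma):
    """One-step Bellman (TD) error for a linear value function V(s) = w @ phi(s)."""
    return r + gamma * (w @ phi_next) - (w @ phi_s)

def direct_update(w, phi_s, phi_next, r, gamma, alpha):
    """Direct (semi-gradient) TD update: the bootstrapped target r + gamma*V(s')
    is treated as a constant, so only V(s) is differentiated. Simple and fast,
    but known to diverge for some function approximators."""
    delta = td_error(w, phi_s, phi_next, r, gamma)
    return w + alpha * delta * phi_s

def residual_gradient_update(w, phi_s, phi_next, r, gamma, alpha):
    """Residual gradient update: true gradient descent on 0.5 * delta**2 for the
    given transition, so gamma*V(s') is differentiated as well. Slower, but
    converges to a local minimum of the mean squared Bellman residual."""
    delta = td_error(w, phi_s, phi_next, r, gamma)
    return w - alpha * delta * (gamma * phi_next - phi_s)

# Hypothetical usage on a single transition (s, r, s'); all values illustrative.
w = np.zeros(4)                             # value-function weights
phi_s = np.array([1.0, 0.0, 0.5, 0.0])      # feature vector of state s
phi_next = np.array([0.0, 1.0, 0.0, 0.5])   # feature vector of successor state s'
w = residual_gradient_update(w, phi_s, phi_next, r=1.0, gamma=0.95, alpha=0.1)
```

The only difference between the two rules is whether the bootstrapped target γV(s') is treated as a constant (direct) or differentiated along with V(s) (residual gradient); the latter makes each step a true gradient descent step on the squared Bellman residual, which is the property behind the convergence guarantee the abstract refers to.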



Author information

Correspondence to Mostafa D. Awheda.


About this article


Cite this article

Awheda, M.D., Schwartz, H.M. A Residual Gradient Fuzzy Reinforcement Learning Algorithm for Differential Games. Int. J. Fuzzy Syst. 19, 1058–1076 (2017). https://doi.org/10.1007/s40815-016-0284-8
