A new self-learning optimal control laws for a class of discrete-time nonlinear systems based on ESN architecture

  • Research Paper
Science China Information Sciences

Abstract

A novel self-learning optimal control method for a class of discrete-time nonlinear systems is proposed, based on an iterative adaptive dynamic programming (ADP) algorithm. The iterative costate functions are proven to converge to the optimal one, and a detailed convergence analysis of the iterative ADP algorithm is given. An echo state network (ESN) architecture is used to approximate the costate function at each iteration. To ensure the reliability of the ESN approximator, its mean square training error is constrained to a satisfactory range. Two simulation examples demonstrate that the proposed control method responds quickly, owing to its special structure and fast training process.
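
The role of the ESN approximator can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the training rule, so the ridge-regression readout used here (a common way to train ESNs), the strategy of enlarging the reservoir until the training error meets a tolerance, the quadratic toy cost matrix `p`, and all function names and parameter values are assumptions made only for illustration.

```python
import numpy as np


def make_reservoir(n_in, n_res, spectral_radius=0.9, seed=0):
    """Draw fixed random input and reservoir weights (these are never trained)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w


def collect_states(w_in, w, inputs):
    """Drive the reservoir with the input sequence and record every state."""
    x = np.zeros(w.shape[0])
    states = []
    for u in inputs:
        x = np.tanh(w_in @ u + w @ x)
        states.append(x.copy())
    return np.array(states)


def train_readout(states, targets, ridge=1e-6):
    """Fit only the linear readout by ridge regression; return weights and MSE."""
    a = states.T @ states + ridge * np.eye(states.shape[1])
    w_out = np.linalg.solve(a, states.T @ targets)
    mse = float(np.mean((states @ w_out - targets) ** 2))
    return w_out, mse


def fit_until_tolerance(inputs, targets, tol=1e-4, sizes=(20, 50, 100, 200)):
    """Try larger reservoirs until the training MSE is within tol (assumed rule)."""
    for n_res in sizes:
        w_in, w = make_reservoir(inputs.shape[1], n_res)
        states = collect_states(w_in, w, inputs)
        _, mse = train_readout(states, targets)
        if mse <= tol:
            break
    return n_res, mse, mse <= tol


# Toy data: the costate of a quadratic cost x^T P x is its gradient, 2 P x.
rng = np.random.default_rng(1)
xs = rng.uniform(-1.0, 1.0, size=(100, 2))  # sampled system states
p = np.array([[2.0, 0.5], [0.5, 1.0]])      # hypothetical symmetric cost matrix
lam = 2.0 * xs @ p                          # target costate for each sample
n_res, mse, ok = fit_until_tolerance(xs, lam)
print(n_res, mse, ok)
```

Only the readout weights are fitted, by a single linear solve, while the reservoir stays fixed. This is what makes ESN training fast compared with gradient-based training of a full recurrent network. The returned flag reports whether the error tolerance was actually met, so a caller can refuse to use an approximator that falls outside the accepted range.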

Author information

Corresponding author

Correspondence to RuiZhuo Song.

About this article

Cite this article

Song, R., Xiao, W. & Sun, C. A new self-learning optimal control laws for a class of discrete-time nonlinear systems based on ESN architecture. Sci. China Inf. Sci. 57, 1–10 (2014). https://doi.org/10.1007/s11432-013-4954-y

  • DOI: https://doi.org/10.1007/s11432-013-4954-y
