
Neural Network-Based Adaptive Optimal Controller – A Continuous-Time Formulation

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 15)

Abstract

We present a new online adaptive control scheme for partially unknown nonlinear systems that converges to the optimal state-feedback control solution for nonlinear systems affine in the input. The main features of the algorithm map onto the characteristics of the reward-based decision-making process in the mammalian brain.

The optimal adaptive control algorithm is derived in a continuous-time framework. The optimal control solution is obtained directly, without system identification. The algorithm is an online policy iteration approach, built on an adaptive critic structure, that finds an approximate solution to the state-feedback, infinite-horizon optimal control problem.
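To make the structure of such an online policy iteration concrete, below is a minimal sketch in the spirit of the scheme described above, for a plant affine in the input with a quadratic control penalty and a critic that represents the value function with a small polynomial basis. Everything named here (f, g, Q, phi, grad_phi, the evaluation interval T, the sample grid) is an illustrative assumption rather than the authors' implementation. The sketch fits the critic weights to an integral Bellman relation measured over short trajectory segments, so the internal dynamics f appear only in the plant simulation that stands in for the real system, never in the learning equations; the policy improvement step uses the standard affine-systems form u = -1/2 R^{-1} g(x)^T dV/dx, which requires knowledge of the input gain g only.

```python
import numpy as np

# --- illustrative plant: scalar system affine in the input (assumption) ---
def f(x):                      # internal dynamics; used only to stand in for the
    return -x + x**3           # real plant in simulate(), never in the learning step

def g(x):                      # input gain, assumed known for policy improvement
    return np.array([1.0])

R = np.array([[1.0]])          # control weighting

def Q(x):                      # state penalty
    return float(x @ x)

# --- adaptive critic: V(x) ~ w' phi(x) with a polynomial basis (assumption) ---
def phi(x):
    return np.array([x[0]**2, x[0]**4])

def grad_phi(x):               # rows are gradients of each basis function w.r.t. x
    return np.array([[2.0 * x[0]], [4.0 * x[0]**3]])

def policy(x, w):
    # policy improvement: u = -1/2 R^{-1} g(x)' dV/dx with the current critic
    dV = grad_phi(x).T @ w     # value-function gradient, shape (n,)
    G = g(x).reshape(-1, 1)    # input matrix, shape (n, m)
    return -0.5 * np.linalg.solve(R, G.T @ dV)

def simulate(x0, w, T, dt):
    # roll the closed loop forward over one evaluation interval with forward
    # Euler, accumulating the running cost; this stands in for measuring the
    # real plant along its trajectory
    x, cost = x0.copy(), 0.0
    for _ in range(int(T / dt)):
        u = policy(x, w)
        cost += (Q(x) + float(u @ R @ u)) * dt
        x = x + (f(x) + g(x) * u) * dt
    return x, cost

def policy_iteration(x_samples, w0, T=0.1, dt=1e-3, iters=15):
    # Each sweep: policy evaluation fits the critic weights to the integral
    # Bellman relation  w' phi(x(t)) = cost over [t, t+T] + w' phi(x(t+T))
    # by least squares over the sampled initial states; policy improvement
    # is implicit, since policy() always uses the newest weights.
    w = w0.copy()
    for _ in range(iters):
        rows, targets = [], []
        for x0 in x_samples:
            xT, c = simulate(np.array([x0]), w, T, dt)
            rows.append(phi(np.array([x0])) - phi(xT))   # regression row
            targets.append(c)                            # measured reinforcement
        w, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return w

if __name__ == "__main__":
    w_final = policy_iteration(np.linspace(-1.0, 1.0, 25), w0=np.ones(2))
    print("critic weights after policy iteration:", w_final)
```

In an online setting the data for the evaluation step would come from measurements of the running closed-loop plant rather than from repeated simulations over a sample grid; the batch least-squares sweep is used here only because it is the shortest way to show the two alternating steps.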

This work was supported by NSF ECS-0501451, NSF ECCS-0801330 and ARO W91NF-05-1-0314.

Editor information

De-Shuang Huang, Donald C. Wunsch II, Daniel S. Levine, Kang-Hyun Jo

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vrabie, D., Lewis, F., Levine, D. (2008). Neural Network-Based Adaptive Optimal Controller – A Continuous-Time Formulation. In: Huang, D.S., Wunsch, D.C., Levine, D.S., Jo, K.H. (eds.) Advanced Intelligent Computing Theories and Applications. With Aspects of Contemporary Intelligent Computing Techniques. ICIC 2008. Communications in Computer and Information Science, vol. 15. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85930-7_37

  • DOI: https://doi.org/10.1007/978-3-540-85930-7_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85929-1

  • Online ISBN: 978-3-540-85930-7

  • eBook Packages: Computer Science (R0)
