Abstract
We present a new online adaptive control scheme for partially unknown nonlinear systems that are affine in the input; the scheme converges to the optimal state-feedback control solution. The main features of the algorithm map onto the characteristics of the reward-based decision-making process in the mammalian brain.
The derivation of the optimal adaptive control algorithm is presented in a continuous-time framework. The optimal control solution is obtained directly, without system identification. The algorithm is an online policy-iteration approach that uses an adaptive critic structure to find an approximate solution to the state-feedback, infinite-horizon optimal control problem.
This work was supported by NSF ECS-0501451, NSF ECCS-0801330 and ARO W91NF-05-1-0314.
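The abstract describes policy iteration: repeatedly evaluating the current feedback policy and then improving it from the resulting value function. For the linear-quadratic special case this reduces to Kleinman's classical iteration, which a minimal sketch can make concrete. The sketch below is illustrative only and assumes SciPy is available; the function name `policy_iteration_lqr` and the double-integrator example are not from the paper, which treats the more general nonlinear, affine-in-the-input setting with neural-network approximators.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

def policy_iteration_lqr(A, B, Q, R, K0, iters=20):
    """Policy iteration for continuous-time LQR (Kleinman's iteration).

    Each step evaluates the current policy u = -K x by solving a Lyapunov
    equation (the linear analogue of policy evaluation), then improves
    the policy from the resulting value matrix P.
    """
    K = K0
    for _ in range(iters):
        Acl = A - B @ K
        # Policy evaluation: Acl' P + P Acl + Q + K' R K = 0
        P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))
        # Policy improvement: K <- R^{-1} B' P
        K = np.linalg.solve(R, B.T @ P)
    return P, K

# Double-integrator example; K0 must be stabilizing for convergence
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
K0 = np.array([[1.0, 1.0]])  # places closed-loop eigenvalues in the left half-plane

P, K = policy_iteration_lqr(A, B, Q, R, K0)
print(np.allclose(P, solve_continuous_are(A, B, Q, R)))  # iterates converge to the CARE solution: True
```

Note that this linear sketch still requires the model matrices A and B; the contribution described in the abstract is precisely that the online nonlinear algorithm obtains the solution without full knowledge of the system dynamics.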
© 2008 Springer-Verlag Berlin Heidelberg
Cite this paper
Vrabie, D., Lewis, F., Levine, D. (2008). Neural Network-Based Adaptive Optimal Controller – A Continuous-Time Formulation. In: Huang, DS., Wunsch, D.C., Levine, D.S., Jo, KH. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Contemporary Intelligent Computing Techniques. ICIC 2008. Communications in Computer and Information Science, vol 15. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85930-7_37
DOI: https://doi.org/10.1007/978-3-540-85930-7_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85929-1
Online ISBN: 978-3-540-85930-7
eBook Packages: Computer Science (R0)