
Information Sciences

Volume 282, 20 October 2014, Pages 167-179

Neural-network-based robust optimal control design for a class of uncertain nonlinear systems via adaptive dynamic programming

https://doi.org/10.1016/j.ins.2014.05.050

Abstract

In this paper, the neural-network-based robust optimal control design for a class of uncertain nonlinear systems is investigated via the adaptive dynamic programming (ADP) approach. First, the robust controller of the original uncertain system is derived by adding a feedback gain to the optimal controller of the nominal system. It is also shown that this robust controller achieves optimality under a specified cost function, which serves as the basic idea of the robust optimal control design. Then, a critic network is constructed to solve the Hamilton–Jacobi–Bellman equation corresponding to the nominal system, where an additional stabilizing term is introduced to guarantee stability. The uniform ultimate boundedness of the closed-loop system is proved by using the Lyapunov approach. Moreover, the obtained results are extended to solve the decentralized optimal control problem of continuous-time nonlinear interconnected large-scale systems. Finally, two simulation examples are presented to illustrate the effectiveness of the established control scheme.

Introduction

In practical control systems, model uncertainties arise frequently and can severely degrade closed-loop performance. Hence, the problem of designing robust controllers for nonlinear systems with uncertainties has drawn considerable attention in the recent literature [43], [15], [31]. Lin et al. [15] showed that the robust control problem can be solved by studying the optimal control problem of the corresponding nominal system, but the detailed procedure was not presented. In [31], the authors developed an iterative algorithm for the online design of robust control for a class of continuous-time nonlinear systems; however, the optimality of the robust controller with respect to a specified cost function was not discussed. In [43], the authors addressed the design of robust tracking controls for a class of uncertain nonholonomic systems actuated by brushed direct current motors, although that work did not address optimality.

The starting point of the strategy developed in this paper is optimal control. The nonlinear optimal control problem requires solving the Hamilton–Jacobi–Bellman (HJB) equation. Though dynamic programming is a conventional method for solving optimization and optimal control problems, it often suffers from the curse of dimensionality, which is primarily due to its backward-in-time procedure. To avoid this difficulty, adaptive/approximate dynamic programming (ADP), based on function approximators such as neural networks, was proposed by Werbos [35] as a method to solve optimal control problems forward in time. Recently, research on ADP and related fields has gained much attention from various scholars [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [12], [13], [14], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [28], [29], [30], [32], [33], [34], [36], [37], [38], [40], [41], [42], [44], [45], [46]. Lewis and Vrabie [13] stated that the ADP technique is closely related to the field of reinforcement learning. Policy iteration is one of the basic algorithms of reinforcement learning; however, it requires an initial admissible control, which is often difficult to find in practice.
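For reference, the optimal control formulation that ADP aims to approximate can be sketched as follows; the notation below (state penalty $Q(x)$, control weight $R$) is the standard generic form and is not necessarily identical to the cost function adopted later in this paper. For the nominal system $\dot{x}=f(x)+g(x)u$ with the infinite-horizon cost

$J(x_{0})=\int_{0}^{\infty}\bigl(Q(x(\tau))+u^{\mathsf T}(\tau)Ru(\tau)\bigr)\,d\tau,$

the optimal cost $J^{*}(x)$ satisfies the HJB equation

$0=\min_{u}\Bigl[Q(x)+u^{\mathsf T}Ru+(\nabla J^{*}(x))^{\mathsf T}\bigl(f(x)+g(x)u\bigr)\Bigr],$

whose minimizing control is

$u^{*}(x)=-\tfrac{1}{2}R^{-1}g^{\mathsf T}(x)\nabla J^{*}(x).$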

To the best of our knowledge, there are few results on the robust optimal control of uncertain nonlinear systems based on ADP, let alone on the decentralized optimal control of large-scale systems; this motivates our research. In fact, this is the first time that a robust optimal control scheme for a class of uncertain nonlinear systems is established via the ADP technique without using an initial admissible control. To begin with, the optimal controller of the nominal system is designed. It is proved that a modification of this optimal control law is in fact a robust controller of the original uncertain system, and that it also achieves optimality under a suitably defined cost function. Then, a critic network is constructed for solving the HJB equation corresponding to the nominal system. In addition, inspired by the work of [5], [24], an additional stabilizing term is introduced to guarantee stability, which relaxes the need for an initial stabilizing control. The uniform ultimate boundedness (UUB) of the closed-loop system is also proved via the Lyapunov approach. Furthermore, the aforementioned results are extended to deal with the decentralized optimal control of a class of continuous-time nonlinear interconnected systems. Finally, two simulation examples are given to show the effectiveness of the robust optimal control scheme.
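As a rough illustration of the two-step design described above, the following Python sketch assumes that a critic approximation of the gradient of the nominal optimal cost is already available; the function names (nominal_optimal_control, robust_control) and all arguments are illustrative and do not correspond to the paper's implementation.

import numpy as np

# Illustrative sketch of the two-step design (hypothetical names):
# 1) the optimal control of the NOMINAL system is computed from a critic
#    approximation of the optimal cost gradient,
#        u*(x) = -1/2 * R^{-1} g(x)^T grad_J(x);
# 2) the robust control of the UNCERTAIN system is obtained by scaling u*
#    with a feedback gain, chosen according to the analysis in later sections.

def nominal_optimal_control(x, g, grad_J, R_inv):
    """u*(x) = -0.5 * R^{-1} g(x)^T grad_J(x); g(x) is the n-by-m input matrix."""
    return -0.5 * R_inv @ g(x).T @ grad_J(x)

def robust_control(x, g, grad_J, R_inv, zeta=1.0):
    """Robust control for the uncertain system: the scaled nominal optimal control."""
    return zeta * nominal_optimal_control(x, g, grad_J, R_inv)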

Section snippets

Problem statement and preliminaries

In this paper, we study the continuous-time uncertain nonlinear system given by

$\dot{x}(t)=f(x(t))+g(x(t))\bigl(\bar{u}(t)+\bar{d}(x(t))\bigr),\quad (1)$

where $x(t)\in\mathbb{R}^{n}$ is the state vector and $\bar{u}(t)\in\mathbb{R}^{m}$ is the control vector, $f(\cdot)$ and $g(\cdot)$ are differentiable in their arguments with $f(0)=0$, and $\bar{d}(x)$ is the unknown nonlinear perturbation. Let $x(0)=x_{0}$ be the initial state. We assume that $\bar{d}(0)=0$, so that $x=0$ is an equilibrium of system (1). As in much of the literature, for the nominal system

$\dot{x}(t)=f(x(t))+g(x(t))u(t),\quad (2)$

we also assume

Robust optimal control design of uncertain nonlinear systems

In this section, for establishing the robust stabilizing control strategy of system (1), we modify the optimal control law (8) of system (2) by proportionally increasing a feedback gain, i.e.,

$\bar{u}(x)=\zeta u^{*}(x)=-\tfrac{1}{2}\zeta R^{-1}g^{\mathsf T}(x)\nabla J^{*}(x).\quad (12)$

Now, we present the following lemma to indicate that the optimal control has infinite gain margin.

Lemma 1

For system (2), the feedback control given by (12) ensures that the closed-loop system is asymptotically stable for all $\zeta \geq 1/2$.

Proof

We show that J(x) is a Lyapunov function. In light

Optimal control design via ADP approach and the stability proof

According to the universal approximation property of neural networks, $J(x)$ can be reconstructed by a single-layer neural network on a compact set $\Omega$ as

$J(x)=\omega_{c}^{\mathsf T}\sigma_{c}(x)+\varepsilon_{c}(x),$

where $\omega_{c}\in\mathbb{R}^{l}$ is the ideal weight, $\sigma_{c}(x)\in\mathbb{R}^{l}$ is the activation function, $l$ is the number of neurons in the hidden layer, and $\varepsilon_{c}(x)$ is the approximation error. Then, we have

$\nabla J(x)=(\nabla\sigma_{c}(x))^{\mathsf T}\omega_{c}+\nabla\varepsilon_{c}(x).$

Based on (24), the Lyapunov Eq. (4) becomes

$0=d_{M}^{2}(x)+u^{\mathsf T}(x)Ru(x)+\bigl(\omega_{c}^{\mathsf T}\nabla\sigma_{c}(x)+(\nabla\varepsilon_{c}(x))^{\mathsf T}\bigr)\bigl(f(x)+g(x)u(x)\bigr).$

In light of [28], [4], [5], in
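The omitted portion of this section derives the critic weight tuning law, which also contains the additional stabilizing term mentioned in the introduction. As an illustration only, the Python sketch below shows a generic normalized gradient-descent update of the critic weights on the squared Hamiltonian residual; the names (sigma_jac, utility, lr) and the absence of the stabilizing term are simplifying assumptions, not the paper's rule.

import numpy as np

# Generic critic tuning sketch (hypothetical names; the paper's update law also
# includes an additional stabilizing term that is NOT reproduced here).
# Critic approximation:  J_hat(x) = w^T sigma(x),
# its gradient:          grad J_hat(x) = sigma_jac(x)^T w,  sigma_jac(x): l-by-n,
# Hamiltonian residual:  e = utility(x, u) + w^T sigma_jac(x) (f(x) + g(x) u).

def critic_update(w, x, u, f, g, sigma_jac, utility, lr=0.1):
    """One normalized gradient-descent step on the squared Hamiltonian residual."""
    xdot = f(x) + g(x) @ u            # nominal dynamics under the current control
    phi = sigma_jac(x) @ xdot         # regressor: time derivative of sigma(x(t))
    e = utility(x, u) + w @ phi       # Hamiltonian (Bellman) residual
    return w - lr * e * phi / (1.0 + phi @ phi) ** 2   # normalized update step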

Decentralized optimal control design of nonlinear interconnected systems

Large-scale systems are common in engineering when studying complex dynamical systems that can be partitioned into a set of interconnected subsystems. Decentralized control is one of the effective design approaches and has attracted a great amount of interest due to its advantages of easier implementation and lower dimensionality [17], [10], [26], [27], [39]. In this section, we generalize the aforementioned results to the decentralized optimal control of a class of

Simulation studies

Two examples are provided in this section to demonstrate the effectiveness of the robust optimal control strategy.

Example 1

Consider the following continuous-time nonlinear system:

$\dot{x}=\begin{bmatrix}-0.5x_{1}+x_{2}(1+0.5x_{2}^{2})\\[2pt]-0.8(x_{1}+x_{2})+0.5x_{2}(1-0.3x_{2}^{2})\end{bmatrix}+\begin{bmatrix}0\\[2pt]-0.6\end{bmatrix}\bigl(\bar{u}+\bar{d}(x)\bigr),$

where $x=[x_{1},x_{2}]^{\mathsf T}\in\mathbb{R}^{2}$ and $\bar{u}\in\mathbb{R}$ are the state and control variables, respectively. The term $\bar{d}(x)=\delta_{1}x_{2}\cos(\delta_{2}x_{1}+\delta_{3}x_{2})$ reflects the uncertainty of the controlled plant, where $\delta_{1}$, $\delta_{2}$, and $\delta_{3}$ are unknown parameters with $\delta_{1}\in[-1,1]$, $\delta_{2}\in[-5,5]$, and $\delta_{3}\in[-3,3]$. We set $R=I$ and
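For concreteness, a minimal open-loop simulation of the plant above can be sketched in Python as follows, assuming the dynamics as written; the Euler integration step, the sampled uncertainty parameters, and the zero-control placeholder are illustrative choices and are not part of the paper's experiment.

import numpy as np

# Minimal simulation sketch of the Example 1 plant (illustrative only).
def f(x):
    x1, x2 = x
    return np.array([-0.5 * x1 + x2 * (1.0 + 0.5 * x2**2),
                     -0.8 * (x1 + x2) + 0.5 * x2 * (1.0 - 0.3 * x2**2)])

def g(x):
    return np.array([0.0, -0.6])

rng = np.random.default_rng(0)
d1 = rng.uniform(-1.0, 1.0)          # delta_1 in [-1, 1]
d2 = rng.uniform(-5.0, 5.0)          # delta_2 in [-5, 5]
d3 = rng.uniform(-3.0, 3.0)          # delta_3 in [-3, 3]

def d_bar(x):
    return d1 * x[1] * np.cos(d2 * x[0] + d3 * x[1])

x = np.array([1.0, -1.0])            # arbitrary initial state
dt = 0.01
for _ in range(2000):                # 20 s of simulated time, forward Euler
    u_bar = 0.0                      # placeholder; the robust controller would act here
    x = x + dt * (f(x) + g(x) * (u_bar + d_bar(x)))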

Conclusion

A novel robust optimal control scheme for a class of uncertain nonlinear systems is developed in this paper via the ADP approach. It is proved that the robust controller of the original uncertain system achieves optimality under a specified cost function. During the implementation, a critic network is constructed to solve the HJB equation of the nominal system, and an additional stabilizing term is introduced to guarantee stability. The obtained results are also extended to design the

References (46)

  • D. Wang et al.

    Neuro-optimal control for a class of unknown nonlinear dynamic systems using SN-DHP technique

    Neurocomputing

    (2013)
  • D. Wang et al.

    Finite-horizon neuro-optimal tracking control for a class of discrete-time nonlinear systems using adaptive dynamic programming approach

    Neurocomputing

    (2012)
  • D. Wang et al.

    Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming

    Automatica

    (2012)
  • H.N. Wu et al.

Simultaneous policy update algorithms for learning the solution of linear continuous-time H∞ state feedback control

    Inform. Sci.

    (2013)
  • D. Xu et al.

    Decentralized measurement feedback stabilization of large-scale systems via control vector Lyapunov functions

    Syst. Control Lett.

    (2013)
  • X. Xu et al.

    Reinforcement learning algorithms with function approximation: recent advances and applications

    Inform. Sci.

    (2014)
  • H.M. Yen et al.

    Design of a robust neural network-based tracking controller for a class of electrically driven nonholonomic mechanical systems

    Inform. Sci.

    (2013)
  • A. Al-Tamimi et al.

    Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof

    IEEE Trans. Syst. Man Cybernet. – Part B: Cybernet.

    (2008)
  • D.P. Bertsekas et al.

    Missile defense and interceptor allocation by neuro-dynamic programming

    IEEE Trans. Syst. Man Cybernet. – Part A: Syst. Humans

    (2000)
  • T. Dierks, S. Jagannathan, Optimal control of affine nonlinear continuous-time systems, in: Proceedings of the American...
  • T. Dierks et al.

    Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update

    IEEE Trans. Neural Netw. Learn. Syst.

    (2012)
  • J. Fu et al.

    Adaptive learning and control for MIMO system based on adaptive dynamic programming

    IEEE Trans. Neural Netw.

    (2011)
  • A. Heydari et al.

    Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics

    IEEE Trans. Neural Netw. Learn. Syst.

    (2013)

    This work was supported in part by the National Natural Science Foundation of China under Grants 61034002, 61233001, 61273140, 61304086, and 61374105, in part by Beijing Natural Science Foundation under Grant 4132078, and in part by the Early Career Development Award of SKLMCCS.
