Event-triggered control for input constrained non-affine nonlinear systems based on neuro-dynamic programming☆
Introduction
In the control research field, optimal controller design, which drives a system toward optimal operation in practice, has attracted considerable interest in recent years. To tackle optimal control problems for nonlinear systems, we often need to solve the Hamilton-Jacobi-Bellman (HJB) equation by dynamic programming (DP) [1]. Due to its strong nonlinearity and the “curse of dimensionality”, the HJB equation is difficult or impossible to solve for complex nonlinear systems. Fortunately, reinforcement learning, neuro-dynamic programming (NDP), and adaptive dynamic programming (ADP), which are viewed as synonyms, tackle this problem by forward computation to obtain approximate solutions to optimal control problems [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12].
Over the past decades, many NDP-based methods have been reported to solve optimal control problems for continuous-time (CT) systems [13] and discrete-time (DT) systems [14], [15], [16], [17], covering trajectory tracking [18], [19], [20], external disturbances and uncertainties [21], [22], [23], [24], [25], [26], fault tolerance [27], [28], and so on. As is well known, when the controlled plant is explicitly expressed as an affine dynamical system [29], the optimal controller can be designed directly with the control input matrix. For example, in [19], an adaptive tracking control method was investigated for a class of uncertain affine systems. In [23], an online learning scheme was provided to solve robust stabilization problems for affine systems by using policy iteration, and this approach was further extended to design a robust optimal controller for unknown affine systems [24]. For non-affine dynamical systems, a neural network (NN) was constructed in [20] to approximate the unknown non-affine dynamics by using input and output data, and a data-driven robust approximate optimal tracking control method was then proposed. Wang et al. [29] developed a robust intelligent critic control framework for non-affine systems. However, the aforementioned approaches are based on the time-triggered control (TTC) mechanism, which always transmits data at a fixed sampling period and might result in a heavy computational burden.
Compared with the fixed periodic sampling of the TTC mechanism, the event-triggered control (ETC) method employs a triggering threshold to determine when to sample states and update or execute actions. Because it avoids unnecessary use of computational resources, ETC has recently received great interest from researchers in NDP- or ADP-based control [30], [31], [32], [33], [34], [35]. In [36], Dong et al. proposed an ETC scheme for unknown affine nonlinear DT systems. The unknown system state was estimated by a model NN, and the actor-critic framework was adopted to learn the ETC law and the value function. In [37], Wang et al. developed an ADP-based ETC method for multi-player affine DT systems. Multiple critic NNs were adopted to approximate different performance index functions, and multiple triggering conditions were presented for different controllers. For nonlinear CT systems, an ETC method for affine systems with unknown dynamics was developed in [38] by introducing an NN observer. Wang et al. adopted an ETC method to solve control problems for affine systems [39], and further extended this approach to affine systems with unknown dynamics by using mixed data [40]. In [41], Yang and He developed a robust ETC method for unknown affine systems. The robust ETC problem was transformed into an optimal ETC problem by introducing an infinite-horizon integral cost function. Under the event-triggered framework, the controllers designed in the aforementioned approaches were updated only when an event occurred. Hence, the computational burden, communication cost and bandwidth usage are all reduced.
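The difference between periodic and event-triggered updating can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: the scalar plant x' = x + u, the feedback gain of 2, and the relative threshold rule |x_held - x| > threshold * |x| are all hypothetical choices and are not the triggering condition of any cited work.

```python
def simulate(event_triggered, threshold=0.2, dt=0.01, steps=1000):
    """Simulate the toy plant x' = x + u under the feedback u = -2 * x_held.

    Hypothetical example: plant, gain and threshold rule are assumptions.
    """
    x = 1.0
    x_held = x                 # state sampled at the last triggering instant
    u = -2.0 * x_held          # control computed from the sampled state
    updates = 1                # the initial control computation counts once
    for _ in range(steps):
        gap = abs(x_held - x)  # event-triggered error
        # Time-triggered control updates at every step; event-triggered
        # control updates only when the gap exceeds the relative threshold.
        if (not event_triggered) or gap > threshold * abs(x):
            x_held = x
            u = -2.0 * x_held
            updates += 1
        x += dt * (x + u)      # forward Euler step; u is held between events
    return x, updates


x_ttc, n_ttc = simulate(event_triggered=False)
x_etc, n_etc = simulate(event_triggered=True)
print(f"time-triggered:  final |x| = {abs(x_ttc):.2e}, updates = {n_ttc}")
print(f"event-triggered: final |x| = {abs(x_etc):.2e}, updates = {n_etc}")
```

In this toy setting both loops drive the state toward zero, while the event-triggered loop recomputes the control far less often, which is the saving in computation and communication that motivates ETC.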
The aforementioned NDP-based ETC methods were proposed for affine systems. However, many practical systems must be described in non-affine form, for which no control input matrix is available, so NDP-based ETC designs for affine systems cannot be applied directly. On the other hand, input constraints are inevitable in real applications and may degrade the control performance. To address input constraints, Ha et al. developed a heuristic dynamic programming-based ETC method for unknown affine DT systems [42]. In [43], by transforming the robust control problem into an optimal control problem, an event-based constrained optimal control method was proposed by incorporating the NDP mechanism for input-constrained affine systems. These methods, however, still address affine systems. Therefore, an NDP-based ETC method for non-affine systems with input constraints is needed, which motivates our research.
Inspired by the aforementioned literature, this paper investigates the optimal ETC for unknown non-affine systems with input constraints by using the NDP technique. The novelties and contributions of this work are summarized in the following two aspects:
1. Different from existing works that addressed optimal control problems for affine nonlinear systems with an event-triggered mechanism [38], [41], [43], this paper extends the NDP-based ETC method to unknown non-affine CT systems with input constraints. The non-affine system is approximated by an affine-like NN-based identifier built from measurable input and output data. With this identifier, the method is suitable not only for affine systems but also for non-affine systems.
2. This scheme reduces the computational burden, communication cost and bandwidth usage, since the developed optimal controller is updated only when an event is triggered and a single critic NN is adopted to approximate the value function. In addition, with the designed event-triggered threshold, both the stability of the closed-loop system and the convergence of the critic NN weight error are guaranteed.
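To illustrate the affine-like idea in the first contribution, the sketch below fits a model of the form f_hat(x) + g_hat(x) * u to input-output samples of a toy non-affine plant. Everything in it is an assumption made for illustration: the plant `plant`, the polynomial regressors in `features`, and the batch least-squares fit, which stands in for the paper's NN identifier and its tuning law (neither is given in this excerpt).

```python
import numpy as np


def plant(x, u):
    """Hypothetical non-affine dynamics x' = F(x, u); not taken from the paper."""
    return -x + np.tanh(u) + 0.1 * x * u


def features(x, u):
    """Affine-like regressors: f_hat(x) uses [x, x^3], g_hat(x) * u uses [1, x] * u."""
    return np.column_stack([x, x**3, u, x * u])


# Measurable input and state-derivative samples (assumed available here).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 500)
u = rng.uniform(-1.0, 1.0, 500)
xdot = plant(x, u)

# Batch least squares replaces online NN weight tuning in this sketch.
weights, *_ = np.linalg.lstsq(features(x, u), xdot, rcond=None)
residual = xdot - features(x, u) @ weights
rmse = float(np.sqrt(np.mean(residual**2)))


def g_hat(x_query):
    """Identified input gain, the quantity an affine-style controller design needs."""
    return weights[2] + weights[3] * x_query


print(f"identification RMSE: {rmse:.4f}")
print(f"estimated input gain at x = 0: {g_hat(0.0):.3f}")
```

The fit is not exact, because tanh(u) is not linear in u, but the identified model exposes an explicit input gain `g_hat`, which is what lets affine-style design tools be reused for a non-affine plant.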
Problem formulation
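Consider a class of non-affine nonlinear CT systems described by

$$\dot{x}(t) = F(x(t), u(t)), \tag{1}$$

where $x(t) \in \mathbb{R}^{n}$ is the system state, $u(t) \in \mathbb{R}^{m}$ is the constrained control input that satisfies the conditions of $|u_i(t)| \le \bar{u}$, $i = 1, \ldots, m$, and $\bar{u}$ is a positive constant.

Assumption 1. The system (1) is controllable, and the nonlinear function $F(x, u)$ is Lipschitz continuous with $F(0, 0) = 0$ on a compact set $\Omega$.

For system (1), the value function is defined as

$$V(x(t)) = \int_{t}^{\infty} U(x(\tau), u(\tau))\, d\tau,$$

where $U(x, u)$ is the utility function with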
Event-triggered controller design based on neuro-dynamic programming
In this section, an affine-like NN-based identifier is provided to approximate the unknown dynamics. Then, a novel event-triggered condition is presented. Moreover, a critic NN is employed to solve the HJB equation. Finally, the stability of the closed-loop system is proved.
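For orientation, the HJB equation mentioned here takes the following standard form for an infinite-horizon, undiscounted problem. The notation is assumed for this sketch, since the excerpt does not reproduce it: dynamics $\dot{x} = F(x, u)$, utility $U(x, u)$, optimal value function $V^{*}(x)$, and admissible bounded input set $\mathcal{U}$.

```latex
0 = \min_{u \in \mathcal{U}}
    \left[ U(x, u) + \left( \nabla V^{*}(x) \right)^{\top} F(x, u) \right],
\qquad V^{*}(0) = 0 .
```

In an event-triggered design the control is computed from the state sampled at the last triggering instant and held until the next event, so the minimizing input is applied only approximately between events. A critic NN is typically used to approximate $V^{*}$, with its weights tuned to reduce the residual of this equation.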
Simulation studies
This section provides two examples to illustrate the effectiveness of the NDP-based ETC method for non-affine unknown nonlinear systems.
Conclusion
In this paper, an NDP-based ETC method is proposed for non-affine nonlinear systems with input constraints using measurable input and output data. An NN-based identifier is constructed to approximate the non-affine system. A critic NN is employed to solve the HJB equation, and the optimal ETC law is obtained. To reduce the computational burden, a new triggering condition related to the system states is presented. The control signal is updated only when the event-triggering condition is
CRediT authorship contribution statement
Shunchao Zhang: Methodology, Writing - original draft. Bo Zhao: Supervision. Yongwei Zhang: Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Shunchao Zhang received his B.S. degree from the School of Electrical and Information Engineering, Hunan Institute of Engineering, Xiangtan, China, in 2016. He received the M.S. degree in 2019 and is currently pursuing the Ph.D. degree at the School of Automation, Guangdong University of Technology, Guangzhou, China. His current research interests include optimal control and adaptive dynamic programming.
References (45)
- et al. Neural-network-based control scheme for a class of nonlinear systems with actuator faults via data-driven reinforcement learning method. Neurocomputing (2017)
- et al. Neural network robust tracking control with adaptive critic framework for uncertain nonlinear systems. Neural Netw. (2018)
- et al. Robust control design for multi-player nonlinear systems with input disturbances via adaptive dynamic programming. Neurocomputing (2019)
- et al. Bounded robust control design for uncertain nonlinear systems using single-network adaptive dynamic programming. Neurocomputing (2017)
- et al. Data-based robust optimal control of continuous-time affine nonlinear systems with matched uncertainties. Inf. Sci. (2016)
- et al. Event-driven optimal control for uncertain nonlinear systems with external disturbance via adaptive dynamic programming. Neurocomputing (2018)
- et al. Event-triggered adaptive dynamic programming for discrete-time multi-player games. Inf. Sci. (2020)
- Dynamic Programming and Optimal Control (1995)
- et al. Adaptive Dynamic Programming with Applications in Optimal Control (2017)
- et al. Iterative ADP learning algorithms for discrete-time multi-player games. Artif. Intell. Rev. (2018)
- Decentralized control for large-scale nonlinear systems with unknown mismatched interconnections via policy iteration. IEEE Trans. Syst. Man Cybern. Part B Cybern.
- Comprehensive comparison of online ADP algorithms for continuous-time optimal control. Artif. Intell. Rev.
- Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits Syst. Mag.
- Adaptive dynamic programming: an introduction. IEEE Comput. Intell. Mag.
- Reinforcement learning-based optimal stabilization for unknown nonlinear systems subject to inputs with uncertain constraints. IEEE Trans. Neural Netw. Learn. Syst.
- Sliding-mode surface-based approximate optimal control for uncertain nonlinear systems with asymptotically stable critic structure. IEEE Trans. Cybern.
- Neural network based online simultaneous policy update algorithm for solving the HJI equation in nonlinear control. IEEE Trans. Neural Netw. Learn. Syst.
- Data-driven control for nonlinear distributed parameter systems. IEEE Trans. Neural Netw. Learn. Syst.
- Adaptive optimal control for a class of continuous-time affine nonlinear systems with unknown internal dynamics. Neural Comput. Appl.
- Generalized policy iteration adaptive dynamic programming for discrete-time nonlinear systems. IEEE Trans. Syst. Man Cybern. Syst.
- Residential energy scheduling for variable weather solar energy based on adaptive dynamic programming. IEEE/CAA J. Automat. Sin.
- Optimal constrained self-learning battery sequential management in microgrid via adaptive dynamic programming. IEEE/CAA J. Automat. Sin.
Bo Zhao received his B.S. degree in Automation and his Ph.D. degree in Control Science and Engineering, both from Jilin University, Changchun, China, in 2009 and 2014, respectively. He was a Post-Doctoral Fellow with the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China, from 2014 to 2017. He was then with the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China, from 2017 to 2018. He is currently an Associate Professor with the School of Systems Science, Beijing Normal University, Beijing, China. He has authored or coauthored over 90 journal and conference papers, and holds 3 patents. His research interests include adaptive dynamic programming, robot control, fault diagnosis and tolerant control, optimal control, and artificial intelligence-based control. He is the secretary of the Adaptive Dynamic Programming and Reinforcement Learning Technical Committee of the Chinese Automation Association (CAA), and was the secretary of the 24th International Conference on Neural Information Processing in 2017. He is an IEEE Senior Member, an Asian-Pacific Neural Network Society (APNNS) Member and a CAA Member.
Yongwei Zhang received his B.S. degree from the School of Electronic and Information Engineering, Jiaying University, Meizhou, China, in 2016. He is currently pursuing his Ph.D. degree at the School of Automation, Guangdong University of Technology, Guangzhou, China. His current research interests include optimal control and adaptive dynamic programming.
☆ This work was supported in part by the National Natural Science Foundation of China under Grants 61973330, 61773075, 62073085 and 61533017, in part by the Beijing Natural Science Foundation under Grant 4212038, in part by the Guangdong Introducing Innovative and Entrepreneurial Teams of “The Pearl River Talent Recruitment Program” under Grant 2019ZT08X340, and in part by the State Key Laboratory of Synthetical Automation for Process Industries under Grant 2019-KF-23-03.