Neurocomputing

Volume 440, 14 June 2021, Pages 175-184
Event-triggered control for input constrained non-affine nonlinear systems based on neuro-dynamic programming

https://doi.org/10.1016/j.neucom.2021.01.116

Highlights

  • Extend the NDP-ETC to unknown non-affine CT constrained systems.

  • This method is suitable not only for affine systems but also for non-affine systems.

  • Reduce the computational burden, communication resources and bandwidth.

Abstract

In this paper, a neuro-dynamic programming (NDP)-based event-triggered control (ETC) method is proposed for unknown non-affine nonlinear systems with input constraints. A neural network-based identifier is established with measurable input and output data to learn the unknown system dynamics. Then, a critic neural network is employed to approximate the value function for solving the event-triggered Hamilton-Jacobi-Bellman equation. Furthermore, an NDP-based ETC scheme is developed, which samples the states and updates the control law when the triggering condition is violated. Compared with traditional time-triggered control methods, the ETC method reduces the computational burden, communication cost, and bandwidth. In addition, the stability of the closed-loop system and the convergence of the critic neural network weight error are established based on Lyapunov's direct method. The inter-sampling time is proved to be bounded below by a positive constant, which excludes Zeno behavior. Finally, two case studies are provided to verify the effectiveness of the developed ETC method.

Introduction

In the field of control research, optimal controller design, which drives the system operation toward optimality in practice, has attracted considerable interest in recent years. In order to tackle optimal control problems for nonlinear systems, we often need to solve the Hamilton-Jacobi-Bellman (HJB) equation by dynamic programming (DP) [1]. Due to its strong nonlinearity and the "curse of dimensionality", the HJB equation is difficult or impossible to solve for complex nonlinear systems. Fortunately, reinforcement learning, neuro-dynamic programming (NDP), and adaptive dynamic programming (ADP), which are viewed as synonyms, tackle this problem by forward computation to obtain approximate solutions to optimal control problems [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12].

Over the past decades, many NDP-based methods have been reported to solve optimal control problems for continuous-time (CT) systems [13] and discrete-time (DT) systems [14], [15], [16], [17], covering trajectory tracking [18], [19], [20], external disturbances and uncertainties [21], [22], [23], [24], [25], [26], fault tolerance [27], [28], and so on. Since the controlled plants in these works are explicitly described as affine dynamical systems [29], the optimal controller can be designed directly with the control input matrix. For example, in [19], an adaptive tracking control method was investigated for a class of uncertain affine systems. In [23], the authors provided an online learning scheme to solve robust stabilization problems for affine systems by using policy iteration, and they further extended this approach to design a robust optimal controller for unknown affine systems [24]. For non-affine dynamical systems, a neural network (NN) was constructed in [20] to approximate the unknown dynamics by using input and output data, and a data-driven robust approximate optimal tracking control method was then proposed. Wang et al. [29] developed a robust intelligent critic control framework for non-affine systems. However, the aforementioned approaches are developed under the time-triggered control (TTC) mechanism, which requires data transmission at a fixed sampling period and may result in a heavy computational burden.

Compared with the fixed periodic sampling of the TTC mechanism, the event-triggered control (ETC) method employs a triggering threshold to determine when to sample the states and update/execute the control actions. To mitigate unnecessary waste of computational resources, ETC has recently received great interest from researchers in the NDP- and ADP-based control fields [30], [31], [32], [33], [34], [35]. In [36], Dong et al. proposed an ETC scheme for unknown affine nonlinear DT systems, where the unknown system state was estimated by a model NN and the actor-critic framework was adopted to learn the ETC law and the value function. In [37], Wang et al. developed an ADP-based ETC method for multi-player affine DT systems, in which multiple critic NNs were adopted to approximate different performance index functions and multiple triggering conditions were presented for different controllers. For nonlinear CT systems, by introducing an NN observer, an ETC method for affine systems with unknown dynamics was developed in [38]. Wang et al. adopted the ETC method to solve H∞ control problems for affine systems [39], and further extended this approach to affine systems with unknown dynamics by using mixed data [40]. In [41], Yang and He developed a robust ETC method for unknown affine systems, where the robust ETC problem was transformed into an optimal ETC problem by introducing an infinite-horizon integral cost function. Under the event-triggered framework, the controllers designed in the aforementioned approaches are updated only when an event occurs. Hence, the computational burden, communication cost and bandwidth are all reduced.
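As a minimal illustration of the sample-and-hold mechanism described above, the following Python sketch simulates a generic event-triggered loop: the control is recomputed only when the gap between the current state and the last sampled state violates a state-dependent threshold. The callables `plant_step`, `controller`, and `threshold`, as well as the norm-based condition, are illustrative assumptions and do not reproduce the specific triggering condition designed in this paper.

```python
import numpy as np

def simulate_etc(plant_step, controller, threshold, x0, dt=0.01, T=10.0):
    """Minimal sketch of an event-triggered sample-and-hold control loop.

    plant_step(x, u, dt) -> next state (stands in for the plant dynamics)
    controller(x_s)      -> control input computed at the last event state
    threshold(x)         -> state-dependent triggering bound (illustrative)
    """
    x = np.asarray(x0, dtype=float)
    x_s = x.copy()               # state sampled at the last event
    u = controller(x_s)          # control held constant between events
    events = [0.0]
    t = 0.0
    while t < T:
        e = x_s - x              # gap between last sampled state and current state
        if np.linalg.norm(e) > threshold(x):
            x_s = x.copy()       # an event occurs: sample the state
            u = controller(x_s)  # and update the control law only now
            events.append(t)
        x = plant_step(x, u, dt) # plant evolves under the held (zero-order-hold) input
        t += dt
    return events
```

Between two consecutive events the input is held constant (zero-order hold), so computation and transmission occur only at the event instants collected in `events`.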

The aforementioned NDP-based ETC methods were proposed for CT affine systems. However, many practical systems must be described in a non-affine form, for which the NDP-based ETC law cannot be designed directly with a control input matrix as in the affine case. On the other hand, input constraints are inevitable in real applications and may degrade the control performance. To address this issue, Ha et al. developed a heuristic dynamic programming-based ETC method for unknown affine DT systems with input constraints [42]. In [43], by transforming the robust control problem into an optimal control problem, an event-based constrained optimal control method was proposed by incorporating the NDP mechanism for input-constrained affine systems. Therefore, it is urgent to develop an NDP-based ETC method for non-affine systems with input constraints, which greatly motivates our research.

Inspired by the aforementioned literature, in this paper, the optimal ETC for unknown non-affine systems with input constraints is investigated by using NDP technique. The novelties and contributions of this work are briefly emphasized in the following two aspects:

  • 1.

    Different from existing works that address optimal control problems for affine nonlinear systems with an event-triggered mechanism [38], [41], [43], this paper extends the NDP-based ETC method to unknown non-affine CT systems with input constraints. An affine-like NN-based identifier is constructed from measurable input and output data, so that the developed method is suitable not only for affine systems but also for non-affine systems.

  • 2.

    This scheme reduces the computational burden, communication cost and bandwidth, since the developed optimal controller is updated only when an event is triggered and a single critic NN is adopted to approximate the value function. In addition, with the designed event-triggered threshold, the stability of the closed-loop system and the convergence of the critic NN weight error are both guaranteed.

The rest of this paper is organized as follows. In Section 2, the problem formulation for non-affine nonlinear CT systems is presented. In Section 3, the NN-based identifier is presented, the NDP-based ETC method is designed in detail, and stability proof is provided. In Section 4, two simulation examples are provided. Finally, in Section 5, conclusions are presented.

Section snippets

Problem formulation

Consider a class of non-affine nonlinear CT systems described by $\dot{x}(t)=F(x(t),u(t))$ (1), where $x(t)\in\mathbb{R}^{n}$ is the system state, $u(t)\in\mathbb{R}^{m}$ is the constrained control input satisfying $|u_{q}|<u_{b},\ q=1,2,\ldots,m$, and $u_{b}$ is a positive constant.

Assumption 1

The system (1) is controllable, and the nonlinear function $F(\cdot,\cdot)$ is Lipschitz continuous with $F(0,0)=0$ on a compact set $\Omega\subset\mathbb{R}^{n}$.

For system (1), the value function is defined as $V(x(t))=\int_{t}^{\infty}U(x(\tau),u(\tau))\,\mathrm{d}\tau$, where $U(x,u)=x^{\mathsf{T}}Qx+M(u)$ is the utility function with $U(0,0)=0$ …
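The section snippet truncates before the constrained term $M(u)$ is specified. For reference, the non-quadratic integrand sketched below is the standard choice in constrained NDP/ADP designs; it is an assumed form here, not necessarily the authors' exact definition, and it yields a minimizing control that is bounded by $u_b$ by construction.

```latex
% Commonly used non-quadratic utility for input constraints (assumed form):
M(u) = 2\int_{0}^{u} u_b \tanh^{-1}\!\left(\nu/u_b\right)^{\mathsf{T}} R \,\mathrm{d}\nu ,
\qquad R = \operatorname{diag}(r_1,\ldots,r_m) \succ 0 ,
% which leads to a saturated control of the form
u^{*}(x) = -\,u_b \tanh\!\left(\tfrac{1}{2u_b}\, R^{-1} G^{\mathsf{T}}(x)\,\nabla V^{*}(x)\right),
% where G(x) is hypothetical notation for the input-gain term of an
% affine-like identified model and \nabla V^{*} is the gradient of the
% optimal value function.
```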

Event-triggered controller design based on neuro-dynamic programming

In this section, an affine-like NN-based identifier is provided to approximate the unknown dynamics. Then, a novel event-triggered condition is presented. Moreover, a critic NN is employed to solve the HJB equation. Finally, the stability of the closed-loop system is proved.
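To make the single-critic step concrete, the sketch below shows a generic value-function approximation $\hat V(x)=\hat W^{\mathsf{T}}\varphi(x)$ with a normalized-gradient weight update that drives the Hamiltonian (HJB) residual toward zero. The basis functions, learning rate, and update rule are common NDP choices and are assumptions here, not the paper's exact critic tuning law.

```python
import numpy as np

# Generic single-critic approximation V_hat(x) = W^T phi(x). The basis phi, the
# learning rate, and the normalized-gradient rule below are standard NDP choices
# and are assumptions here, not the paper's exact critic tuning law.

def phi(x):
    """Illustrative quadratic basis for a two-dimensional state."""
    x1, x2 = x
    return np.array([x1 ** 2, x1 * x2, x2 ** 2])

def dphi_dx(x):
    """Jacobian of the basis functions with respect to the state."""
    x1, x2 = x
    return np.array([[2.0 * x1, 0.0],
                     [x2,       x1],
                     [0.0, 2.0 * x2]])

def critic_update(W, x, x_dot, utility, lr=0.1):
    """One normalized-gradient step that drives the Hamiltonian (HJB) residual
    e = (dV_hat/dx) x_dot + U(x, u) toward zero."""
    sigma = dphi_dx(x) @ x_dot          # time derivative of phi(x) along the trajectory
    e = W @ sigma + utility             # Hamiltonian residual with the current weights
    denom = (1.0 + sigma @ sigma) ** 2  # normalization keeps the step well scaled
    return W - lr * e * sigma / denom
```

In the event-triggered setting, the control entering the utility term is held at its value from the most recent event between triggering instants.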

Simulation studies

This section provides two examples to illustrate the effectiveness of the NDP-based ETC method for non-affine unknown nonlinear systems.

Conclusion

In this paper, an NDP-based ETC method is proposed for non-affine nonlinear systems with input constraints using measurable input and output data. An NN-based identifier is constructed to approximate the non-affine system. A critic NN is employed to solve the HJB equation and the optimal ETC law is obtained. In order to reduce the computational burden, a new triggering condition related to the system states is presented. The control signal is updated only when the event-triggering condition is violated …

CRediT authorship contribution statement

Shunchao Zhang: Methodology, Writing - original draft. Bo Zhao: Supervision. Yongwei Zhang: Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (45)

  • B. Zhao et al.

    Decentralized control for large-scale nonlinear systems with unknown mismatched interconnections via policy iteration

    IEEE Trans. Syst. Man Cybern. Syst.

    (2018)
  • Y. Zhu et al.

    Comprehensive comparison of online ADP algorithms for continuous-time optimal control

    Artif. Intell. Rev.

    (2018)
  • F.L. Lewis et al.

    Reinforcement learning and adaptive dynamic programming for feedback control

    IEEE Circuits Syst. Mag.

    (2009)
  • F. Wang et al.

    Adaptive dynamic programming: an introduction

    IEEE Comput. Intell. Mag.

    (2009)
  • B. Zhao et al.

    Reinforcement learning-based optimal stabilization for unknown nonlinear systems subject to inputs with uncertain constraints

    IEEE Trans. Neural Netw. and Learn. Syst.

    (2020)
  • B. Zhao et al.

    Sliding-mode surface-based approximate optimal control for uncertain nonlinear systems with asymptotically stable critic structure

    IEEE Trans. Cybern.

    (2020)
  • H. Wu et al.

    Neural network based online simultaneous policy update algorithm for solving the HJI equation in nonlinear H∞ control

    IEEE Trans. Neural Netw. Learn. Syst.

    (2012)
  • B. Luo et al.

    Data-driven H∞ control for nonlinear distributed parameter systems

    IEEE Trans. Neural Netw. Learn. Syst.

    (2015)
  • D. Liu et al.

    Adaptive optimal control for a class of continuous-time affine nonlinear systems with unknown internal dynamics

    Neural Comput. Appl.

    (2013)
  • D. Liu et al.

    Generalized policy iteration adaptive dynamic programming for discrete-time nonlinear systems

    IEEE Trans. Syst. Man Cybern. Syst.

    (2015)
  • D. Liu et al.

    Residential energy scheduling for variable weather solar energy based on adaptive dynamic programming

    IEEE/CAA J. Automat. Sin.

    (2018)
  • Q. Wei et al.

    Optimal constrained self-learning battery sequential management in microgrid via adaptive dynamic programming

    IEEE/CAA J. Automat. Sin.

    (2017)
    Shunchao Zhang received his B.S. degree from the School of Electrical and Information Engineering, Hunan Institute of Engineering, Xiangtan, China, in 2016. He received the M.S. degree in 2019 and is currently pursuing the Ph.D. degree with the School of Automation, Guangdong University of Technology, Guangzhou, China. His current research interests include optimal control and adaptive dynamic programming.

    Bo Zhao received his B.S. degree in Automation and Ph.D. degree in Control Science and Engineering, both from Jilin University, Changchun, China, in 2009 and 2014, respectively. He was a Post-Doctoral Fellow with the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China, from 2014 to 2017, and then joined the same laboratory from 2017 to 2018. He is currently an Associate Professor with the School of Systems Science, Beijing Normal University, Beijing, China. He has authored or coauthored over 90 journal and conference papers, and has held 3 patents. His research interests include adaptive dynamic programming, robot control, fault diagnosis and tolerant control, optimal control, and artificial intelligence-based control. He is the secretary of the Adaptive Dynamic Programming and Reinforcement Learning Technical Committee of the Chinese Association of Automation (CAA), and was the secretary of the 24th International Conference on Neural Information Processing in 2017. He is an IEEE Senior Member, an Asia-Pacific Neural Network Society (APNNS) Member, and a CAA Member.

    Yongwei Zhang received his B.S. degree from the School of Electronic and Information Engineering, Jiaying University, Meizhou, China, in 2016. He is currently pursuing the Ph.D. degree with the School of Automation, Guangdong University of Technology, Guangzhou, China. His current research interests include optimal control and adaptive dynamic programming.

    This work was supported in part by the National Natural Science Foundation of China under Grants 61973330, 61773075, 62073085 and 61533017, in part by the Beijing Natural Science Foundation under Grant 4212038, in part by the Guangdong Introducing Innovative and Entrepreneurial Teams of "The Pearl River Talent Recruitment Program" under Grant 2019ZT08X340, and in part by the State Key Laboratory of Synthetical Automation for Process Industries under Grant 2019-KF-23-03.
