Neurocomputing

Volume 537, 7 June 2023, Pages 187-197

Event-triggered near-optimal tracking control based on adaptive dynamic programming for discrete-time systems

https://doi.org/10.1016/j.neucom.2023.03.045

Abstract

Frequent state monitoring and controller updates can enhance the precision of tracking control, but they also overburden the communication network. In this paper, to reduce communication costs, we propose event-triggered control algorithms for the optimal tracking control problem. First, we reconstruct the discrete-time nonlinear system into a converted system. Then, the adaptive dynamic programming (ADP) algorithm is employed to find the optimal controller off-line, and the event-triggered scheme is used to reduce the communication costs online. Novel triggering conditions with fewer assumptions are designed to implement the event-triggered scheme. Different from existing works, the event-triggered scheme can be introduced not only into the converted system but also into the actual system, which is more practical because the actual controller is the only one accessible in practice. In addition, with the developed algorithms, the tracking error is proved to be stable at the origin; in other words, the actual system is guaranteed to track the desired trajectory. The algorithms developed in this paper are implemented by three neural networks: the model network, the action network, and the critic network. Finally, examples are presented to verify the effectiveness and rationality of the algorithms.

Introduction

With the onset of the big data era, reducing resource consumption and communication costs has gradually become a mainstream demand [1], [2]. However, traditional time-triggered control systems rely on equidistant sampling, a conservative way to guarantee control performance at the expense of communication costs [3]. In practice, most physical systems have finite bandwidth between sensors, controllers, and actuators, such as multi-agent systems and wireless sensor-actuator networks [4], [5], [6]. To trade off performance against costs, the event-triggered control (ETC) method uses aperiodic sampling: the event-triggered controller is executed only once an event is triggered [7]. Thus, ETC methods can reduce information traffic and save communication resources [8]. Recently, many results on event-triggered approaches have emerged [9], [10], [11], [12], [13], [14], [15], [16], including event-triggered adaptive dynamic programming (ADP) methods [17], [18].

As a classical optimal control problem, the optimal tracking control problem [19], [20] can be solved efficiently by ADP methods [21], [22], [23], [24], [25], [26]. Many existing works [27], [28], [29] first convert the optimal tracking control problem into an optimal regulation problem and then solve the regulation problem by ADP methods. In this paper, the system of the optimal tracking control problem is called the actual system, and the system of the optimal regulation problem is called the converted system. ADP is a self-learning method based on neural networks with the actor-critic structure [30], [31], [32], [33], and it has been applied widely in practice, for example to micro-grids, solar energy, and unmanned underwater vehicles [34], [35], [36], [37], [38], [39]. As data volumes expand, the need to economize on communication costs grows, which has given rise to event-triggered ADP algorithms [40], [41], [42], [43].

The event-triggered ADP method consists of two parts: off-line optimization and online control. First, iterative ADP algorithms are employed to obtain the optimal control policy off-line. Then, the optimal controller is executed online through the event-triggered scheme. Thus, the event-triggered ADP method balances performance and costs on demand by executing the optimal control policy intermittently. Iterative ADP algorithms fall into two classes, value iteration (VI) and policy iteration (PI), whose implementation can be found in [23], [33]. Policy iteration has been shown to keep the system stable but requires a known initial admissible control policy. Value iteration does not need this condition, but the stability of the system is not easily guaranteed. Thus, we obtain the optimal control policy based on PI in this paper.
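For concreteness, the following is a minimal sketch of policy iteration for a generic discounted discrete-time regulation problem on a coarse grid; the scalar dynamics, quadratic utility, discount factor, and grid resolution are illustrative assumptions, not the formulation or implementation used in this paper.

```python
# Minimal policy-iteration (PI) sketch for a generic discounted discrete-time
# regulation problem. F, U, gamma, and the grids below are assumptions.
import numpy as np

states = np.linspace(-1.0, 1.0, 41)           # discretized state space
actions = np.linspace(-1.0, 1.0, 21)          # discretized action space
gamma = 0.95                                  # discount factor (assumption)

def F(x, u):                                  # assumed affine dynamics
    return 0.9 * x + 0.5 * u

def U(x, u):                                  # quadratic utility
    return x ** 2 + u ** 2

def nearest(x):                               # project a successor state onto the grid
    return int(np.argmin(np.abs(states - x)))

# PI needs an initial admissible (stabilizing) policy; u = 0 is admissible here.
policy = np.full(len(states), int(np.argmin(np.abs(actions))))

V = np.zeros(len(states))
for _ in range(30):                           # outer PI iterations
    # Policy evaluation: V(x) = U(x, u(x)) + gamma * V(F(x, u(x)))
    for _ in range(500):
        V_new = np.array([U(x, actions[policy[i]])
                          + gamma * V[nearest(F(x, actions[policy[i]]))]
                          for i, x in enumerate(states)])
        if np.max(np.abs(V_new - V)) < 1e-9:
            V = V_new
            break
        V = V_new
    # Policy improvement: u(x) = argmin_u { U(x, u) + gamma * V(F(x, u)) }
    policy = np.array([int(np.argmin([U(x, u) + gamma * V[nearest(F(x, u))]
                                      for u in actions]))
                       for x in states])
```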

The event-triggered ADP method is an ideal tool for solving discrete-time optimal tracking control problems, which demand high accuracy and smooth communication, and several works have investigated this topic [44], [45], [46], [47]. Although many researchers have focused on event-triggered ADP methods, most results are in the continuous-time domain [46]. There are also realistic discrete-time control systems in which the communication burden needs to be reduced, such as scheduling and path-planning problems. Besides, most existing event-triggered optimal controllers designed for discrete-time systems are only applicable to optimal regulation systems. Even when solving the optimal tracking problem, the problem is first converted into an optimal regulation problem, and an event-triggered controller is then designed for the converted system [44]. Only Lu et al. [45] proposed an event-triggered ADP algorithm for the actual controller; however, their algorithm only guarantees that the tracking error is uniformly ultimately bounded (UUB) rather than stable at the origin. Accordingly, an event-triggered algorithm that accommodates both system stability and performance optimality is urgently needed for discrete-time optimal tracking control problems. This inspired our research.

In this paper, two event-triggered ADP algorithms are developed: one applies the event-triggered mechanism to the converted controller, and the other applies it to the actual controller. The tracking error can be guaranteed to be stable at the origin; that is, as time tends to infinity, the state of the actual system reaches the trajectory being tracked. We first employ the ADP method to obtain the optimal controller off-line. Then, to trade off performance against communication costs, we update the control input through the event-triggered scheme. The main contributions of this paper are:

  • 1) To solve the optimal tracking control problem with limited communication costs, we develop two event-triggered ADP algorithms. Different from existing works [44], [48], the event-triggered scheme can be incorporated both into the actual controller and into the converted controller, which is more practical.

  • 2) Novel triggering conditions are designed for the developed algorithms. Different from existing works [49], we reduce the required assumptions by introducing a new triggering condition. Thus, the feasibility of the event-triggered ADP algorithm is improved.

  • 3) Different from existing works [45], the developed algorithms guarantee the stability of the tracking error at the origin; consequently, the actual system is guaranteed to track the desired trajectory accurately.

The remaining contents are structured as follows. The formulation of the optimal tracking control problem, consisting of the system conversion and the optimal regulation problem, is introduced in Section II. In Section III, two event-triggered control algorithms are given, and the asymptotic stability is proved. In Section IV, the neural network implementation is described. In Section V, we test the algorithms with two simulations. Section VI concludes this paper.

Notation. The transpose of a matrix $A$ is denoted by $A^{T}$, and its inverse is denoted by $A^{-1}$. $\mathbb{R}$ denotes the set of real numbers, and $\mathbb{R}^{n}$ denotes the $n$-dimensional Euclidean space. $\|\cdot\|$ denotes the Euclidean norm of a vector or matrix. $\{k_j\}_{j=0}$ denotes the set $\{k_0, k_1, k_2, \ldots, k_j\}$.

Section snippets

Optimal tracking control problems

Define a desired trajectory [48] to be tracked as $\beta_{k+1}=\psi(\beta_k)$. Suppose the desired trajectory is bounded. Consider the actual system to be a discrete-time nonlinear affine system, given by $x_{k+1}=f(x_k)+g(x_k)u_k$, where $x_k\in\mathbb{R}^{n}$ is the state of the actual system and $u_k\in\mathbb{R}^{m}$ is the actual control. Assume that $f$ and $g$ are differentiable in their arguments with $f(0)=0$, and that $g$ is invertible.

The purpose of the tracking control system is that the actual system can track the desired trajectory through manipulating the …
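To illustrate the tracking setup described above, the sketch below forms the tracking error $e_k = x_k - \beta_k$ and the steady (feedforward) control that keeps the actual system on the desired trajectory, following the system-conversion idea used in works such as [48]; the particular $f$, $g$, and $\psi$ are placeholder assumptions.

```python
# Illustrative tracking setup: actual system, desired trajectory, tracking
# error, and the steady control u_d. f, g, and psi are placeholders, not the
# paper's plant.
import numpy as np

def f(x):                                   # drift term, f(0) = 0
    return np.array([0.9 * x[0] + 0.1 * x[1], 0.8 * x[1]])

def g(x):                                   # invertible input-gain matrix
    return np.eye(2) + 0.1 * np.diag(np.tanh(x))

def psi(beta):                              # bounded desired trajectory beta_{k+1} = psi(beta_k)
    return np.array([0.95 * beta[1], -0.95 * beta[0]])

def steady_control(beta):
    # u_d(beta) solves f(beta) + g(beta) u_d = psi(beta); g is assumed invertible
    return np.linalg.solve(g(beta), psi(beta) - f(beta))

x = np.array([1.0, -0.5])                   # actual system state
beta = np.array([0.5, 0.5])                 # desired trajectory state
for k in range(10):
    e = x - beta                            # tracking error used by the converted system
    u = steady_control(beta)                # feedforward part only; the ADP feedback acts on e
    x = f(x) + g(x) @ u                     # actual system update
    beta = psi(beta)                        # desired trajectory update
    print(k, np.round(e, 4))
```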

Event-triggered near-optimal tracking control algorithms and stability analysis

In this section, we present two event-triggered near-optimal tracking control algorithms based on the ADP method. Each algorithm contains two parts: off-line optimization and online control. The ADP algorithm is used to obtain the optimal control policy off-line during the optimization process, and the optimal control policy is executed online through the event-triggered scheme during the control process. That is, the control input is updated to the optimal one once an event is triggered. Although the …
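The following is a minimal sketch of the online control phase only: the control input is recomputed when a triggering condition fires and held (zero-order hold) otherwise. The plant, the stand-in policy mu, and the threshold-type condition are illustrative assumptions and do not reproduce the triggering conditions derived in this paper.

```python
# Sketch of event-triggered execution: sample and update the control only
# when the triggering condition fires; hold the last value otherwise.
import numpy as np

def plant(x, u):                             # assumed scalar plant
    return 0.9 * x + 0.5 * u

def mu(x):                                   # stands in for the offline-learned policy
    return -0.8 * x

x, x_hat = 1.0, 1.0                          # x_hat: state at the last triggering instant
u = mu(x_hat)
events = 0
for k in range(50):
    gap = abs(x - x_hat)                     # event-triggering error
    if gap > 0.05 * abs(x) + 1e-4:           # assumed threshold-type condition
        x_hat = x                            # sample the state
        u = mu(x_hat)                        # update the control input
        events += 1
    x = plant(x, u)                          # control held constant otherwise
print("events:", events, "of 50 steps; final state:", round(x, 5))
```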

Neural network implementation

In this paper, neural networks are used to implement the algorithms. There are three main parts: the model network, the critic network, and the action network.

The weight matrix between the hidden layer and the output layer is denoted by $W$, and the weight matrix between the input layer and the hidden layer is denoted by $Y$. The activation function is defined as $\sigma(d_k)=(1-e^{-d_k})/(1+e^{-d_k})$. $\zeta$ is defined as the approximation error of the neural networks. The learning rate is denoted by $\eta$. The superscripts $m$, $a$, and $c$ …
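As an illustration of one such three-layer network (input, hidden, output) with the activation function $\sigma$ given above, the sketch below performs a plain gradient-descent update; the layer sizes, targets, and training details are assumptions rather than the exact implementation used in the paper.

```python
# Sketch of a three-layer network with weights Y (input->hidden) and
# W (hidden->output), activation sigma, and learning rate eta. Layer sizes
# and the plain gradient-descent update are assumptions.
import numpy as np

def sigma(d):                                  # sigma(d) = (1 - e^{-d}) / (1 + e^{-d})
    return (1.0 - np.exp(-d)) / (1.0 + np.exp(-d))

def sigma_prime(d):                            # derivative: 0.5 * (1 - sigma(d)^2)
    s = sigma(d)
    return 0.5 * (1.0 - s ** 2)

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 10, 2               # assumed layer sizes
Y = rng.normal(scale=0.1, size=(n_hidden, n_in))   # input -> hidden weights
W = rng.normal(scale=0.1, size=(n_out, n_hidden))  # hidden -> output weights
eta = 0.05                                         # learning rate

def forward(x):
    d = Y @ x                                  # hidden-layer pre-activation
    h = sigma(d)
    return W @ h, h, d

def train_step(x, target):
    global W, Y
    y, h, d = forward(x)
    err = y - target                           # plays the role of the approximation error zeta
    delta = (W.T @ err) * sigma_prime(d)       # backpropagated hidden-layer error
    W -= eta * np.outer(err, h)                # gradient of 0.5*||err||^2 w.r.t. W
    Y -= eta * np.outer(delta, x)              # gradient w.r.t. Y
    return 0.5 * float(err @ err)
```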

Examples

Two simulations are employed to show the effectiveness of the methods studied.

Example 1. Considering the 2-DOF helicopter system in [51], a state-space description of the helicopter is given as follows: $\dot{x}=ax+bu$, where $x=[\theta,\psi,\dot{\theta},\dot{\psi}]^{T}\in\mathbb{R}^{4}$ and $u=[F_p,F_y]^{T}\in\mathbb{R}^{2}$. $\theta$ denotes the pitch angle, $\psi$ denotes the yaw angle, $F_p$ is the controller on the pitch axis, and $F_y$ is the controller on the yaw axis. The matrices $a$ and $b$ are given as
$$a=\begin{bmatrix}0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & -B_p/J_{T_p} & 0\\ 0 & 0 & 0 & -B_y/J_{T_y}\end{bmatrix},\qquad b=\begin{bmatrix}0 & 0\\ 0 & 0\\ K_{pp}/J_{T_p} & K_{py}/J_{T_p}\\ K_{yp}/J_{T_y} & K_{yy}/J_{T_y}\end{bmatrix}.$$
The meaning of the parameters …
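For reference, a sketch of the 2-DOF helicopter state-space matrices together with a simple forward-Euler discretization suitable for discrete-time algorithms is given below; the numerical parameter values and the sampling period are placeholders, since the paper's values are not reproduced in this excerpt.

```python
# 2-DOF helicopter state-space matrices from Example 1 and a forward-Euler
# discretization. Parameter values and sampling period are placeholders.
import numpy as np

Bp, By = 0.8, 0.3                     # assumed damping coefficients
JTp, JTy = 0.04, 0.04                 # assumed total moments of inertia
Kpp, Kpy, Kyp, Kyy = 0.2, 0.005, 0.02, 0.07   # assumed thrust gains

a = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, 0, -Bp / JTp, 0],
              [0, 0, 0, -By / JTy]])
b = np.array([[0, 0],
              [0, 0],
              [Kpp / JTp, Kpy / JTp],
              [Kyp / JTy, Kyy / JTy]])

dt = 0.01                             # assumed sampling period
A_d = np.eye(4) + dt * a              # forward-Euler: x_{k+1} = A_d x_k + B_d u_k
B_d = dt * b

x = np.array([0.1, -0.1, 0.0, 0.0])   # [pitch, yaw, pitch rate, yaw rate]
u = np.array([0.0, 0.0])              # [F_p, F_y]
x_next = A_d @ x + B_d @ u
```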

Conclusion

In this paper, the event-triggered scheme is studied to reduce the communication demand in solving optimal tracking control problems. Two event-triggered algorithms based on ADP, ETCC and ETCA, are developed with novel triggering conditions. Compared with existing works, the developed algorithms are more feasible because one hypothetical condition is no longer required, and the proposed triggering conditions are easier to evaluate since they do not require knowledge of the optimal value function at future moments.

CRediT authorship contribution statement

Ziyang Wang: Conceptualization, Methodology, Software, Validation, Investigation, Writing - original draft. Joonhyup Lee: Investigation, Writing - review & editing. Qinglai Wei: Conceptualization, Writing - review & editing. Anting Zhang: Investigation, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.


References (51)

  • Cameron Nowzari et al., Event-triggered communication and control of networked systems for multi-agent consensus, Automatica (2019).

  • Xianwei Li et al., Adaptive event-triggered consensus of multiagent systems on directed graphs, IEEE Trans. Autom. Control (2021).

  • V.S. Dolk et al., Output-based and decentralized dynamic event-triggered control with guaranteed Lp-gain performance and Zeno-freeness, IEEE Trans. Autom. Control (2017).

  • Xian-Ming Zhang et al., An overview and deep investigation on sampled-data-based event-triggered control and filtering for networked systems, IEEE Trans. Industr. Inf. (2017).

  • D.P. Borgers et al., Event-separation properties of event-triggered control systems, IEEE Trans. Autom. Control (2014).

  • M.C.F. Donkers et al., Output-based event-triggered control with guaranteed L∞-gain and improved and decentralized event-triggering, IEEE Trans. Autom. Control (2012).

  • Romain Postoyan et al., A framework for the event-triggered stabilization of nonlinear systems, IEEE Trans. Autom. Control (2015).

  • Anton Selivanov et al., Event-triggered H∞ control: A switching approach, IEEE Trans. Autom. Control (2016).

  • Biao Luo et al., Event-triggered optimal control with performance guarantees using adaptive dynamic programming, IEEE Trans. Neural Networks Learn. Syst. (2020).

  • Xiangnan Zhong et al., GrHDP solution for optimal consensus control of multiagent discrete-time systems, IEEE Trans. Syst., Man, Cybern.: Syst. (2020).

  • Yi Chang et al., Switched-observer-based adaptive output-feedback control design with unknown gain for pure-feedback switched nonlinear systems via average dwell time, Int. J. Syst. Sci. (2021).

  • Xiaoheng Chang et al., H∞ tracking control design of T-S fuzzy systems, Control and Decision (2008).

  • P.J. Werbos, Advanced forecasting methods for global crisis warning and models of intelligence, General Syst. Yearbook (1977).

  • J.J. Murray et al., Adaptive dynamic programming, IEEE Trans. Syst., Man, Cybern., Part C (Appl. Rev.) (2002).

  • A. Al-Tamimi et al., Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof, IEEE Trans. Syst., Man, Cybern., Part B (Cybern.) (2008).

    Ziyang Wang received the B.S. degree in automation and the Ph.D. degree in Control Science and Engineering, both from the University of Science and Technology Beijing, Beijing, China, in 2014 and 2020, respectively. From 2020 to 2022, he was a Postdoctoral Fellow with the Department of Automation, Tsinghua University, Beijing, China. He is currently an Assistant Professor with Xilingol Vocational College. His research interests include adaptive dynamic programming, group decision-making, optimal control, and crowd intelligence science.

    Joonhyup Lee received the B.S. degree in automation and the M.S. degree in Control Science and Engineering, both from Tsinghua University, Beijing, China, in 2019 and 2022, respectively. His research interests include machine learning and group decision-making.

    Qinglai Wei received the B.S. degree in automation and the Ph.D. degree in control theory and control engineering from Northeastern University, Shenyang, China, in 2002 and 2009, respectively. From 2009 to 2011, he was a Postdoctoral Fellow with the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China, where he is currently a Professor and the Associate Director. He has authored four books, and published over 80 international journal papers. His research interests include adaptive dynamic programming, neural-networks-based control, optimal control, nonlinear systems, and their industrial applications.

    Anting Zhang is an associate professor. He graduated from Tsinghua University in 1996 with a master's degree in automatic control theory and application. From 1996 to 2022, he was engaged in scientific research in the Department of Automation of Tsinghua University. His main research interests include enterprise informatization, e-commerce, big data, and crowd science.

    This work was supported by the National Key Research and Development Program of China (No. 2019YFB1404904).
