Adaptive optimal safety tracking control for multiplayer mixed zero-sum games of continuous-time systems

Applied Intelligence

Abstract

Reliable safety protection is essential for preventing serious accidents while equipment is in operation. This paper proposes an optimal safety tracking control scheme based on adaptive dynamic programming (ADP) for multiplayer mixed zero-sum games, in which a control barrier function (CBF) is incorporated into the value function to keep the system within its safe region. First, a system transformation converts the original tracking problem into a state tracking error problem. Second, an augmented Hamilton-Jacobi-Bellman (HJB) equation is derived from the improved augmented error system and the value function. Unlike traditional methods, the scheme uses a single critic neural network (NN) instead of an actor-critic NN structure to approximate the Nash equilibrium solution of the system, and it introduces a concurrent learning technique that relaxes the traditional persistence of excitation condition to a simpler condition on recorded data. The stability of the system is then analyzed in detail using Lyapunov theory. Finally, two simulation examples verify the effectiveness of the proposed scheme.
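To make the concurrent learning idea concrete, the following is a minimal sketch under stated assumptions, not the update law derived in the paper. It assumes a Bellman residual that is linear in the critic weights, e = sigma^T w + r, and applies a generic normalized-gradient update that sums the gradient at the current sample with the gradients over a small stack of recorded samples. The names `residual`, `normalized_gradient`, `train`, `make_sample`, and `W_TRUE`, as well as all numerical values, are hypothetical and chosen only for illustration.

```python
# Sketch only: a generic normalized-gradient critic update with concurrent
# learning. Every name and number below is an illustrative assumption and is
# not taken from the paper.

def residual(w, sigma, r):
    """Bellman residual e = sigma^T w + r, assumed linear in the critic weights w."""
    return sum(s * wi for s, wi in zip(sigma, w)) + r


def normalized_gradient(w, sigma, r):
    """Gradient of 0.5 * e^2 with respect to w, divided by (1 + sigma^T sigma)^2."""
    e = residual(w, sigma, r)
    norm = (1.0 + sum(s * s for s in sigma)) ** 2
    return [s * e / norm for s in sigma]


def train(w0, current, recorded, rate=0.5, steps=500):
    """Repeat w <- w - rate * (gradient at current sample + gradients over recorded data)."""
    w = list(w0)
    for _ in range(steps):
        grad = normalized_gradient(w, *current)
        for sample in recorded:
            g = normalized_gradient(w, *sample)
            grad = [a + b for a, b in zip(grad, g)]
        w = [wi - rate * gi for wi, gi in zip(w, grad)]
    return w


W_TRUE = [1.0, -2.0]  # hypothetical ideal critic weights


def make_sample(sigma):
    """Build a pair (sigma, r) whose residual vanishes exactly at W_TRUE."""
    return (sigma, -sum(s * wi for s, wi in zip(sigma, W_TRUE)))


# The live regressor never changes direction, so it is not persistently exciting.
current = make_sample([1.0, 0.0])
# Two recorded regressors that together span R^2 (a rank condition on stored data).
recorded = [make_sample([1.0, 1.0]), make_sample([0.5, -1.0])]

w_no_cl = train([0.0, 0.0], current, recorded=[])    # second weight never moves
w_cl = train([0.0, 0.0], current, recorded=recorded)  # both weights converge

print("without recorded data:", w_no_cl)
print("with recorded data:   ", w_cl)
```

In this toy setting the live regressor alone identifies only the first weight, while adding two recorded regressors that span the weight space lets both weights converge. This mirrors the way a rank condition on stored data can stand in for persistent excitation of the live signal.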

Data Availability

The authors can confirm that all relevant data are included in the article.

Acknowledgements

This work was supported by the Youth Backbone Teachers in Colleges and Universities of Henan Province (2018GGJS017) and the Science and Technology Research Project of Henan Province (222102240014).

Author information

Corresponding author

Correspondence to Dehua Zhang.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Qin, C., Zhang, Z., Shang, Z. et al. Adaptive optimal safety tracking control for multiplayer mixed zero-sum games of continuous-time systems. Appl Intell 53, 17460–17475 (2023). https://doi.org/10.1007/s10489-022-04348-9

  • DOI: https://doi.org/10.1007/s10489-022-04348-9
