Abstract
This paper presents a novel optimal synchronization control method for multi-agent systems with input saturation. Multi-agent game theory is used to recast the optimal synchronization control problem as a multi-agent nonzero-sum game, whose Nash equilibrium is obtained by solving the coupled Hamilton-Jacobi-Bellman (HJB) equations with nonquadratic input energy terms. A novel off-policy reinforcement learning method is presented to obtain the Nash equilibrium solution without knowledge of the system models, and critic neural networks (NNs) and actor NNs are introduced to implement it. Theoretical analysis shows that the iterative control laws converge to the Nash equilibrium. Simulation results demonstrate the effectiveness of the presented method.
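To make the phrase "nonquadratic input energy terms" concrete, the following is a minimal sketch of one penalty commonly used in the constrained-input optimal control literature, written for a scalar input. It is an assumption for illustration only: the abstract does not state the exact cost used in the paper. The names `input_penalty`, `saturated_control`, `lam` (saturation bound), `r` (input weight), and `grad_term` (standing in for the product of the input gain and the value-function gradient) are hypothetical.

```python
import math


def input_penalty(u: float, lam: float, r: float = 1.0) -> float:
    """Nonquadratic input cost W(u) = 2 * integral_0^u lam * r * atanh(v / lam) dv.

    Closed form for a scalar input with |u| < lam (assumed form, not taken
    from the paper): 2*lam*r*u*atanh(u/lam) + lam^2 * r * ln(1 - (u/lam)^2).
    """
    if abs(u) >= lam:
        raise ValueError("input must lie strictly inside the saturation bound")
    ratio = u / lam
    return 2.0 * lam * r * u * math.atanh(ratio) + lam**2 * r * math.log(1.0 - ratio**2)


def saturated_control(grad_term: float, lam: float, r: float = 1.0) -> float:
    """Minimizer of W(u) + grad_term * u, which is bounded by lam by construction.

    Setting dW/du = 2*lam*r*atanh(u/lam) equal to -grad_term gives
    u = -lam * tanh(grad_term / (2 * lam * r)).
    """
    return -lam * math.tanh(grad_term / (2.0 * lam * r))


if __name__ == "__main__":
    lam, r = 2.0, 1.5
    for g in (0.1, 1.0, 10.0, 100.0):
        u = saturated_control(g, lam, r)
        print(f"grad_term={g:7.1f}  u={u:+.4f}  |u| <= lam: {abs(u) <= lam}")
```

For small inputs this penalty behaves like the quadratic cost `r * u**2`, while the resulting control law can never exceed the bound `lam`, which is the reason such terms appear in HJB equations for saturated actuators.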
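Abstract (translated from the Chinese)
For multi-agent systems with input saturation, this paper proposes an optimal consensus control method. Multi-agent game theory is introduced to transform the optimal consensus control problem into a multi-agent nonzero-sum game. The Nash equilibrium is then achieved by solving the coupled Hamilton-Jacobi-Bellman (HJB) equations with nonquadratic input energy terms. An off-policy reinforcement learning method is proposed to obtain the Nash equilibrium solution when the system models are unknown; critic neural networks and actor neural networks are introduced to implement the proposed method. Theoretical analysis shows that the iterative control laws converge to the Nash equilibrium. Simulation experiments verify the effectiveness of the proposed method.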
Additional information
Project supported by the National Key R&D Program of China (No. 2018YFB1702300) and the National Natural Science Foundation of China (Nos. 61722312 and 61533017).
Contributors
Hongyang LI designed the method, conducted the simulation, and drafted the paper. Qinglai WEI revised and finalized the paper.
Compliance with ethics guidelines
Hongyang LI and Qinglai WEI declare that they have no conflict of interest.
Cite this article
Li, H., Wei, Q. Optimal synchronization control for multi-agent systems with input saturation: a nonzero-sum game. Front Inform Technol Electron Eng 23, 1010–1019 (2022). https://doi.org/10.1631/FITEE.2200010
DOI: https://doi.org/10.1631/FITEE.2200010
Key words
- Optimal synchronization control
- Multi-agent systems
- Nonzero-sum game
- Adaptive dynamic programming
- Input saturation
- Off-policy reinforcement learning
- Policy iteration