References
Huang M, Caines P E, Malhame R P. Large-population cost-coupled LQG problems with nonuniform agents: individual-mass behavior and decentralized ε-Nash equilibria. IEEE Trans Autom Control, 2007, 52: 1560–1571
Li T, Zhang J F. Asymptotically optimal decentralized control for large population stochastic multiagent systems. IEEE Trans Autom Control, 2008, 53: 1643–1660
Wang B C, Zhang H, Zhang J F. Mean field linear-quadratic control: uniform stabilization and social optimality. Automatica, 2020, 121: 109088
Bian T, Jiang Z P. Continuous-time robust dynamic programming. SIAM J Control Optim, 2019, 57: 4150–4174
Xu Z, Shen T, Huang M. Model-free policy iteration approach to NCE-based strategy design for linear quadratic Gaussian games. Automatica, 2023, 155: 111162
Li N, Li X, Xu Z Q. Policy iteration reinforcement learning method for continuous-time mean-field linear-quadratic optimal problem. 2023. arXiv:2305.00424
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 62122043).
Additional information
Supporting information: Appendixes A–E. The supporting information is available online at info.scichina.com and link.springer.com. The supporting materials are published as submitted, without typesetting or editing. The responsibility for scientific accuracy and content remains entirely with the authors.
Cite this article
Wang, BC., Li, S. & Cao, Y. An online value iteration method for linear-quadratic mean field social control with unknown dynamics. Sci. China Inf. Sci. 67, 140203 (2024). https://doi.org/10.1007/s11432-023-3962-0