Value iteration algorithm for continuous-time linear quadratic stochastic optimal control problems

Wang, Guangchen; Zhang, Heng

doi:10.1007/s11432-023-3820-3

Research Paper
Published: 25 January 2024

Volume 67, article number 122204, (2024)
Science China Information Sciences

Guangchen Wang¹ & Heng Zhang¹

Abstract

In this study, we investigate a continuous-time infinite-horizon linear quadratic stochastic optimal control problem with multiplicative noise in both the state and control variables. Using the techniques of stochastic stability, exact observability, and stochastic approximation, a value iteration algorithm is developed to solve the corresponding generalized algebraic Riccati equation. Unlike the existing policy iteration algorithm, this algorithm does not rely on an initial stabilizing control. The algorithm can also be used to carry out the policy evaluation steps that arise in the policy iteration algorithm. A simulation example is provided to validate the obtained results.
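The abstract does not state the update rule, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes the standard formulation dx = (Ax + Bu)dt + (Cx + Du)dW with cost E∫(x'Qx + u'Ru)dt, for which the generalized algebraic Riccati equation reads A'P + PA + C'PC + Q − (PB + C'PD)(R + D'PD)⁻¹(B'P + D'PC) = 0. The sketch starts from P = 0 (so no stabilizing gain is needed) and takes small constant steps along the Riccati residual, which is a plain Euler iteration rather than a stochastic approximation scheme with diminishing step sizes. The function names, the step size, the tolerance, and the scalar test system are all illustrative assumptions.

```python
import numpy as np

def gare_residual(P, A, B, C, D, Q, R):
    """Left-hand side of the generalized algebraic Riccati equation at P."""
    S = P @ B + C.T @ P @ D      # cross term  P B + C' P D
    G = R + D.T @ P @ D          # control weight inflated by control-dependent noise
    return A.T @ P + P @ A + C.T @ P @ C + Q - S @ np.linalg.solve(G, S.T)

def value_iteration(A, B, C, D, Q, R, step=0.01, tol=1e-10, max_iter=100_000):
    """Constant-step iteration P <- P + step * residual(P), started at P = 0.

    Hypothetical helper for illustration; returns the final P, the feedback
    gain K (for u = K x), and the number of updates performed.
    """
    n = A.shape[0]
    P = np.zeros((n, n))         # no initial stabilizing control is required
    for k in range(max_iter):
        F = gare_residual(P, A, B, C, D, Q, R)
        if np.linalg.norm(F) < tol:
            break
        P = P + step * F
        P = 0.5 * (P + P.T)      # remove round-off asymmetry
    K = -np.linalg.solve(R + D.T @ P @ D, B.T @ P + D.T @ P @ C)
    return P, K, k

# Assumed scalar test system: open-loop unstable drift, noise in state and control.
A = np.array([[0.5]])
B = np.array([[1.0]])
C = np.array([[0.2]])
D = np.array([[0.1]])
Q = np.array([[1.0]])
R = np.array([[1.0]])

P, K, iters = value_iteration(A, B, C, D, Q, R)
print("P =", P, "K =", K, "updates =", iters)
```

For this scalar system the Riccati equation reduces to the quadratic 1.03P² − 1.05P − 1 = 0, so the iterate can be checked against its positive root. A constant step is used only to keep the sketch short; the step must be small relative to the closed-loop dynamics for the iteration to stay stable.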

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61925306, 61821004, 11831010), the National Key R&D Program of China (Grant No. 2022YFA1006103), and the Natural Science Foundation of Shandong Province (Grant Nos. ZR2019ZD42, ZR2020ZD24). The authors greatly appreciate the efforts of the anonymous reviewers, which have improved the quality of this paper.

Author information

Authors and Affiliations

School of Control Science and Engineering, Shandong University, Jinan, 250061, China
Guangchen Wang & Heng Zhang

Corresponding author

Correspondence to Heng Zhang.

About this article

Cite this article

Wang, G., Zhang, H. Value iteration algorithm for continuous-time linear quadratic stochastic optimal control problems. Sci. China Inf. Sci. 67, 122204 (2024). https://doi.org/10.1007/s11432-023-3820-3

Received: 20 February 2023
Revised: 26 April 2023
Accepted: 08 June 2023
Published: 25 January 2024
DOI: https://doi.org/10.1007/s11432-023-3820-3
