Value iteration algorithm for continuous-time linear quadratic stochastic optimal control problems

  • Research Paper
  • Science China Information Sciences

Abstract

In this study, we investigate a continuous-time infinite-horizon linear quadratic stochastic optimal control problem with multiplicative noise in both the state and control variables. Using techniques from stochastic stability, exact observability, and stochastic approximation, we develop a value iteration algorithm to solve the corresponding generalized algebraic Riccati equation. Unlike the existing policy iteration algorithm, this algorithm does not require an initial stabilizing control. It can also be used to carry out the policy evaluation steps that arise in the policy iteration algorithm. A simulation example is provided to validate the results.
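To make the idea concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the standard model dx = (Ax + Bu)dt + (Cx + Du)dW with cost E∫(x'Qx + u'Ru)dt, for which the generalized algebraic Riccati equation reads A'P + PA + C'PC + Q − (PB + C'PD)(R + D'PD)⁻¹(B'P + D'PC) = 0. The matrix names, the constant step size, and the example system are all assumptions made for illustration; the stochastic-approximation viewpoint mentioned in the abstract typically uses diminishing step sizes instead. The iteration starts from P = 0, so no stabilizing gain is needed.

```python
import numpy as np

def gare_residual(P, A, B, C, D, Q, R):
    """Left-hand side of the generalized algebraic Riccati equation at P."""
    S = P @ B + C.T @ P @ D      # cross term P B + C' P D
    G = R + D.T @ P @ D          # effective control weight R + D' P D
    return A.T @ P + P @ A + C.T @ P @ C + Q - S @ np.linalg.solve(G, S.T)

def value_iteration(A, B, C, D, Q, R, step=0.01, tol=1e-10, max_iter=200_000):
    """Iterate P <- P + step * residual(P) from P = 0 (assumed constant step)."""
    n = A.shape[0]
    P = np.zeros((n, n))         # no initial stabilizing control required
    for _ in range(max_iter):
        F = gare_residual(P, A, B, C, D, Q, R)
        if np.linalg.norm(F) < tol:
            break
        P = P + step * F
        P = 0.5 * (P + P.T)      # keep P symmetric against round-off
    # Feedback gain for the control law u = -K x
    K = np.linalg.solve(R + D.T @ P @ D, B.T @ P + D.T @ P @ C)
    return P, K

# Hypothetical example: a double integrator (open-loop unstable) with
# small multiplicative noise entering through both state and control.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = 0.1 * np.eye(2)
D = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P, K = value_iteration(A, B, C, D, Q, R)
print("P =", P)
print("K =", K)
print("residual norm =", np.linalg.norm(gare_residual(P, A, B, C, D, Q, R)))
```

The update is a forward-Euler step along the associated Riccati differential equation, so the step size must be small relative to the closed-loop dynamics; a step that is too large can make the iteration diverge.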

Acknowledgements

This work was supported by National Natural Science Foundation of China (Grant Nos. 61925306, 61821004, 11831010), National Key R&D Program of China (Grant No. 2022YFA1006103), and Natural Science Foundation of Shandong Province (Grant Nos. ZR2019ZD42, ZR2020ZD24). The authors greatly appreciate the efforts of anonymous reviewers, which have improved the quality of this paper.

Author information

Corresponding author

Correspondence to Heng Zhang.

About this article

Cite this article

Wang, G., Zhang, H. Value iteration algorithm for continuous-time linear quadratic stochastic optimal control problems. Sci. China Inf. Sci. 67, 122204 (2024). https://doi.org/10.1007/s11432-023-3820-3

  • DOI: https://doi.org/10.1007/s11432-023-3820-3
