Control the population of free viruses in nonlinear uncertain HIV system using Q-learning

Gholizade-Narm, Hossein; Noori, Amin

doi:10.1007/s13042-017-0639-y

Original Article
Published: 27 February 2017

Volume 9, pages 1169–1179 (2018)
International Journal of Machine Learning and Cybernetics

Hossein Gholizade-Narm¹ & Amin Noori¹,²

311 Accesses
11 Citations
Abstract

This paper presents a new method for reducing the number of infected cells and free virus particles (virions) in a nonlinear HIV model. Control performance is evaluated in three scenarios. In the first, the system and its initial conditions are assumed to be completely known. In the second, the initial conditions are chosen randomly. In the third, additive noise is taken into account in addition to the uncertain initial conditions. An optimal control approach is used to design an effective drug schedule that reduces the number of infected cells and free virions, both with and without uncertainty. The drug delivery rate is obtained off-line using Q-learning, one of the most widely applied reinforcement learning algorithms. Because Q-learning is model-free, the control performance is not expected to change significantly in the presence of uncertainty. Simulation results confirm that the proposed controller performs well in controlling the free virions for both the certain and the uncertain HIV models.
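The off-line, model-free scheme outlined in the abstract can be illustrated with a minimal tabular Q-learning sketch. This page does not reproduce the paper's model equations, parameters, state discretization, or reward, so everything below is an assumption made for illustration: a basic three-state infection model (uninfected cells `x`, infected cells `y`, free virions `v`), hypothetical parameter values, three assumed drug-efficacy levels as actions, viral-load bins as states, and a reward that penalizes viral load and drug use. It is not the authors' controller.

```python
import numpy as np

# Hypothetical parameters for a basic three-state infection model
# (uninfected cells x, infected cells y, free virions v). Illustrative only.
LAM, D, BETA, A, K, C = 10.0, 0.1, 0.01, 0.5, 10.0, 3.0
DOSES = np.array([0.0, 0.5, 0.9])             # assumed drug-efficacy levels (actions)
V_BINS = np.array([1.0, 10.0, 50.0, 100.0])   # assumed viral-load bin edges (states)
DT, STEPS_PER_ACTION = 0.02, 50               # Euler step and steps held per decision


def simulate(state, dose, rng, noise_std=0.0):
    """Advance the model one decision interval under a constant dose."""
    x, y, v = state
    for _ in range(STEPS_PER_ACTION):
        infection = BETA * (1.0 - dose) * x * v
        dx = LAM - D * x - infection
        dy = infection - A * y
        dv = K * y - C * v
        x, y, v = x + DT * dx, y + DT * dy, v + DT * dv
    # Additive noise on the state, clipped so populations stay non-negative.
    noisy = np.array([x, y, v]) + noise_std * rng.standard_normal(3)
    return np.maximum(noisy, 0.0)


def discretize(state):
    """Map the viral load v to a bin index used as the Q-table state."""
    return int(np.digitize(state[2], V_BINS))


def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.95):
    """One tabular Q-learning update, applied in place."""
    target = reward + gamma * np.max(q[s_next])
    q[s, a] += alpha * (target - q[s, a])


def train(episodes=100, horizon=30, epsilon=0.2, noise_std=0.0, seed=0):
    """Learn a dose table off-line from simulated episodes."""
    rng = np.random.default_rng(seed)
    q = np.zeros((len(V_BINS) + 1, len(DOSES)))
    for _ in range(episodes):
        # Random initial condition, mirroring the uncertain scenarios.
        state = np.array([rng.uniform(50, 100),
                          rng.uniform(0, 5),
                          rng.uniform(1, 50)])
        s = discretize(state)
        for _ in range(horizon):
            # Epsilon-greedy exploration over the discrete dose levels.
            if rng.random() < epsilon:
                a = int(rng.integers(len(DOSES)))
            else:
                a = int(np.argmax(q[s]))
            state = simulate(state, DOSES[a], rng, noise_std)
            s_next = discretize(state)
            # Assumed reward: penalize viral load and, weakly, drug use.
            reward = -(state[2] + 5.0 * DOSES[a])
            q_update(q, s, a, reward, s_next)
            s = s_next
    return q


if __name__ == "__main__":
    q_table = train(noise_std=0.5)
    print("Greedy dose per viral-load bin:", DOSES[np.argmax(q_table, axis=1)])
```

Note that the learner only sees sampled transitions and rewards, never the model equations, so the same `train` loop runs unchanged when `noise_std` is nonzero or the initial state is random. That is the property the abstract relies on when it expects performance to hold up under uncertainty.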

Author information

Authors and Affiliations

Department of Control Engineering, Shahrood University of Technology, Shahrood, Iran
Hossein Gholizade-Narm & Amin Noori
Sadjad University of Technology, Mashhad, Iran
Amin Noori

Corresponding author

Correspondence to Hossein Gholizade-Narm.

About this article

Cite this article

Gholizade-Narm, H., Noori, A. Control the population of free viruses in nonlinear uncertain HIV system using Q-learning. Int. J. Mach. Learn. & Cyber. 9, 1169–1179 (2018). https://doi.org/10.1007/s13042-017-0639-y

Received: 03 February 2015
Accepted: 18 January 2017
Published: 27 February 2017
Issue Date: July 2018
DOI: https://doi.org/10.1007/s13042-017-0639-y
