
Control the population of free viruses in nonlinear uncertain HIV system using Q-learning

  • Original Article
  • Published in International Journal of Machine Learning and Cybernetics

Abstract

This paper presents a new method to reduce the populations of infected cells and free virus particles (virions) based on a nonlinear HIV model. Three scenarios are considered for evaluating the control performance. In the first, the system and its initial conditions are assumed to be completely known. In the second, the initial conditions are chosen randomly. In the third, additive noise is taken into account in addition to the uncertainty in the initial conditions. Optimal control is used to design an effective drug schedule that reduces the number of infected cells and free virions with and without uncertainty. The drug delivery rate is obtained off-line using the Q-learning algorithm, one of the most widely used reinforcement learning algorithms. Since Q-learning is a model-free algorithm, the control performance is not expected to change significantly in the presence of uncertainty. Simulation results confirm that the proposed method controls the free virions effectively for both the certain and uncertain HIV models.
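The abstract describes the approach only at a high level; the specific HIV dynamics, state discretization, action set, and reward used in the paper appear in the full text. The following is a minimal sketch of tabular Q-learning applied to drug scheduling on a simplified, Wodarz–Nowak-style infection model. The dynamics, parameter values, state bins, reward penalty, and the step/discretize/train helpers are illustrative assumptions, not the authors' formulation; the random_init and noise_std options loosely mimic the second and third scenarios (random initial conditions and additive noise).

import numpy as np

# Illustrative HIV dynamics: x = uninfected CD4+ T cells, y = infected cells,
# v = free virions. Parameters and the reward below are placeholders, not the
# paper's exact values.
LAMBDA, D, BETA = 1.0, 0.1, 0.015    # T-cell production, death, infection rates
A, K, U_CLEAR = 0.3, 20.0, 3.0       # infected-cell death, virion burst, clearance

def step(state, drug, dt=0.1, noise_std=0.0):
    """One Euler step of the model; `drug` in [0, 1] scales the infection term."""
    x, y, v = state
    dx = LAMBDA - D * x - (1 - drug) * BETA * x * v
    dy = (1 - drug) * BETA * x * v - A * y
    dv = K * y - U_CLEAR * v
    nxt = np.array([x + dt * dx, y + dt * dy, v + dt * dv])
    nxt += noise_std * np.random.randn(3)          # scenario 3: additive noise
    return np.clip(nxt, 0.0, None)

def discretize(state, bins):
    """Map the continuous state to a tabular index on a coarse grid."""
    return tuple(int(np.clip(np.digitize(s, b), 0, len(b))) for s, b in zip(state, bins))

BINS = [np.linspace(0, 20, 8), np.linspace(0, 10, 8), np.linspace(0, 50, 8)]
ACTIONS = np.linspace(0.0, 0.9, 5)                 # candidate drug efficacies

def reward(state, drug):
    # Penalize free virions and infected cells plus drug usage (side effects).
    x, y, v = state
    return -(v + 0.5 * y + 0.2 * drug)

def train(episodes=2000, horizon=300, alpha=0.1, gamma=0.98, eps=0.1,
          random_init=False, noise_std=0.0):
    Q = {}
    for _ in range(episodes):
        s = np.array([10.0, 0.1, 1.0])
        if random_init:                            # scenario 2: random initial conditions
            s *= np.random.uniform(0.5, 1.5, 3)
        for _ in range(horizon):
            key = discretize(s, BINS)
            q = Q.setdefault(key, np.zeros(len(ACTIONS)))
            a = np.random.randint(len(ACTIONS)) if np.random.rand() < eps else int(q.argmax())
            s_next = step(s, ACTIONS[a], noise_std=noise_std)
            q_next = Q.setdefault(discretize(s_next, BINS), np.zeros(len(ACTIONS)))
            # Tabular Q-learning update (off-policy, model-free).
            q[a] += alpha * (reward(s_next, ACTIONS[a]) + gamma * q_next.max() - q[a])
            s = s_next
    return Q

if __name__ == "__main__":
    Q = train(random_init=True, noise_std=0.05)    # uncertain initial state + noise
    print(f"visited discrete states: {len(Q)}")

After training, the greedy action argmax over Q for each visited state would serve as the off-line drug schedule; since the update never uses the model equations directly, the same learning loop applies unchanged when the initial state or dynamics are uncertain.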



Author information

Corresponding author

Correspondence to Hossein Gholizade-Narm.

About this article

Cite this article

Gholizade-Narm, H., Noori, A. Control the population of free viruses in nonlinear uncertain HIV system using Q-learning. Int. J. Mach. Learn. & Cyber. 9, 1169–1179 (2018). https://doi.org/10.1007/s13042-017-0639-y
