
Cancer cells population control in a delayed-model of a leukemic patient using the combination of the eligibility traces algorithm and neural networks

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

The main purpose of this paper is to provide a solution that efficiently reduces the population of cancer cells while injecting the lowest possible dose of the drug, thereby reducing the drug's side effects on healthy cells. A mathematical model of stem cells in Chronic Myelogenous Leukemia (CML) is used. To this end, a hybrid method is proposed that combines the eligibility traces algorithm with neural networks. The eligibility traces algorithm is a well-known method in the Reinforcement Learning (RL) framework; it allows the cancer cell population to be controlled with higher accuracy and has a significant impact on the injected dosage. Its key advantage is the backward view: previous states are taken into account as well, which improves the learning procedure, the speed at which the cancer cell population is reduced, and the total dosage of drug injected over the treatment period in patients with CML. Combining this method with neural networks provides a continuous state space, so there is no limitation on the states that can be considered; moreover, the optimal dosage is obtained quickly and with high accuracy, which is a significant advantage of the proposed method. To demonstrate its effectiveness in controlling the cancer cell population and obtaining the optimal dosage, the proposed method is compared with four alternatives: the eligibility traces algorithm alone, the Q-learning algorithm alone, optimal control, and no drug injection. The results show that the combination of the eligibility traces algorithm and neural networks controls the cancer cell population more quickly, with higher accuracy, and with a lower drug dosage.
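As a rough illustration of the approach the abstract describes (backward-view eligibility traces combined with a neural-network value function), the sketch below implements SARSA(λ) with per-weight accumulating traces on a one-hidden-layer network, driving a toy one-dimensional "cell population" plant. The dynamics, dose levels, reward weights, and all hyper-parameters here are invented for illustration only; they are not the paper's delayed CML model or its tuning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discrete dose levels (illustrative, not from the paper).
DOSES = np.array([0.0, 0.5, 1.0])
GAMMA, LAM, ALPHA, EPS = 0.95, 0.8, 0.005, 0.1
H = 8  # hidden units

# One-hidden-layer network: Q(state) -> one value per dose action.
W1 = rng.normal(0, 0.5, (H, 1)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.5, (3, H)); b2 = np.zeros(3)
params = [W1, b1, W2, b2]
traces = [np.zeros_like(p) for p in params]  # one trace per weight

def forward(s):
    x = np.array([s])
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2, h, x

def grads(a, h, x):
    # Gradient of Q(s, a) with respect to each parameter array.
    gW2 = np.zeros_like(W2); gW2[a] = h
    gb2 = np.zeros_like(b2); gb2[a] = 1.0
    back = W2[a] * (1 - h**2)          # backprop through tanh
    return [np.outer(back, x), back, gW2, gb2]

def policy(q):
    # Epsilon-greedy over the three dose actions.
    return rng.integers(3) if rng.random() < EPS else int(np.argmax(q))

def env_step(pop, dose):
    # Toy dynamics: growth reduced by a dose-dependent kill term.
    nxt = max(pop * (1.05 - 0.30 * dose), 0.0)
    reward = -(nxt + 0.2 * dose)       # penalise both cells and drug amount
    return nxt, reward

for episode in range(200):
    pop = 1.0
    for tr in traces:
        tr[:] = 0.0                    # reset traces at episode start
    q, h, x = forward(pop); a = policy(q)
    for _ in range(30):
        nxt, r = env_step(pop, DOSES[a])
        q2, h2, x2 = forward(nxt); a2 = policy(q2)
        delta = r + GAMMA * q2[a2] - q[a]        # SARSA TD error
        for tr, g in zip(traces, grads(a, h, x)):
            tr *= GAMMA * LAM; tr += g           # accumulating trace
        for p, tr in zip(params, traces):
            p += ALPHA * delta * tr              # backward-view update
        pop, q, h, x, a = nxt, q2, h2, x2, a2
```

The traces let each TD error update every weight in proportion to its responsibility for recently visited states (the "backward view" the abstract mentions), while the network provides the continuous state representation in place of a tabular Q-function.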



Author information

Corresponding author

Correspondence to Amin Noori.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest with any person(s) or organization(s).

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Kalhor, E., Noori, A. & Noori, G. Cancer cells population control in a delayed-model of a leukemic patient using the combination of the eligibility traces algorithm and neural networks. Int. J. Mach. Learn. & Cyber. 12, 1973–1992 (2021). https://doi.org/10.1007/s13042-021-01287-8

