
A closed-loop healthcare processing approach based on deep reinforcement learning

Multimedia Tools and Applications

Abstract

In healthcare, the human body can be viewed as a controlled input-output system that generates different observations as external interventions vary. The intervention acts as the input, and the output is the phenotype observation that reflects the latent health state of the body system. The objective of healthcare is to determine effective intervention strategies that can nurse an unhealthy human body to a healthy state. With advances in the Internet of Things (IoT) and body sensor networks, it has become convenient to observe multimedia data of the human body anywhere and anytime. To aid healthcare decision making, we propose constructing human body simulators based on deep neural networks (DNNs) for healthcare research. First, we formulate a model of the human body system based on DNNs. Our analysis shows that DNN-based models can simulate practical situations, e.g., that some health states are unreachable. Then, we combine deep reinforcement learning (DRL) with conceptual embedding techniques to explore effective healthcare strategies for simulated human bodies. We implement a virtual human body simulator, which takes interventions and represents its hidden states as high-dimensional images, and a DRL-based treatment module, which diagnoses the latent health state from the image observations and chooses interventions to nurse the simulated body to a target state. By combining the body simulator and the treatment module, we create a dynamic closed loop for healthcare information processing. Experimental simulations are performed to validate the feasibility of the proposed approach.
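The closed loop described in the abstract (observe, diagnose, intervene, observe again) can be illustrated with a minimal Python sketch. It is not the authors' implementation: `ToyBody`, `diagnose`, `choose_intervention`, the 4-dimensional intervention, the linear state update, and the 8-bit "pixel" observation are all hypothetical. The DRL policy is replaced by a plain greedy one-step lookahead on a copy of the simulator, and the target is assumed to be the state vector (1,0,0,0,0,0,0,0,0).

```python
import copy
import random

STATE_DIM = 9    # latent health state size (assumed, matching the appendix)
ACTION_DIM = 4   # hypothetical intervention size


class ToyBody:
    """Hypothetical stand-in for the DNN-based body simulator."""

    def __init__(self, rng):
        # Fixed random linear response of the hidden state to interventions.
        self.weights = [[rng.uniform(-1, 1) for _ in range(ACTION_DIM)]
                        for _ in range(STATE_DIM)]
        self.state = [rng.random() for _ in range(STATE_DIM)]

    def step(self, action):
        """Apply an intervention, then return the new observation."""
        for i, row in enumerate(self.weights):
            drift = sum(w * a for w, a in zip(row, action))
            self.state[i] = min(1.0, max(0.0, self.state[i] + 0.1 * drift))
        return self.observe()

    def observe(self):
        """Expose the hidden state only as 8-bit 'pixel' intensities."""
        return [round(255 * s) for s in self.state]


def diagnose(observation):
    """Estimate the latent state from an observation."""
    return [pixel / 255 for pixel in observation]


def distance(state, target):
    """Euclidean distance between two state vectors."""
    return sum((s - t) ** 2 for s, t in zip(state, target)) ** 0.5


def choose_intervention(body, target, rng, candidates=32):
    """Greedy one-step lookahead on a simulator copy (a stand-in, not DRL)."""
    options = [[0.0] * ACTION_DIM]  # doing nothing is always a candidate
    options += [[rng.uniform(-1, 1) for _ in range(ACTION_DIM)]
                for _ in range(candidates)]

    def score(action):
        trial = copy.deepcopy(body)  # try the action without touching the body
        return distance(diagnose(trial.step(action)), target)

    return min(options, key=score)


def run_closed_loop(steps=30, seed=0):
    """Run the loop and return the diagnosed distance to target per step."""
    rng = random.Random(seed)
    body = ToyBody(rng)
    target = [1.0] + [0.0] * (STATE_DIM - 1)  # assumed target state
    history = [distance(diagnose(body.observe()), target)]
    for _ in range(steps):
        action = choose_intervention(body, target, rng)
        observation = body.step(action)
        history.append(distance(diagnose(observation), target))
    return history
```

Calling `run_closed_loop()` returns the diagnosed distance to the target after each intervention. Because the zero intervention is always among the candidates, the distance never increases in this toy setting. A learned policy such as the one the paper describes would have no such guarantee and would not need to query a copy of the simulator.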





Acknowledgements

The authors thank the editor and anonymous reviewers for their helpful comments and valuable suggestions.

Author information

Corresponding author

Correspondence to Khan Muhammad.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work is supported in part by the National Natural Science Foundation of China under Grant Number 61632009, the Guangdong Provincial Natural Science Foundation under Grant Number 2017A030308006, the High Level Talents Program of Higher Education in Guangdong Province under Funding Support Number 2016ZJ01, and the Hunan Provincial Science and Technology Project Foundation under Grant Numbers 2018TP1018 and 2018RS3065.

Appendix


1.1 Tongue images generated by the decoding network

Given a 9-dimensional health state vector, the decoding network generates a 32 × 32 × 3 tongue image with the corresponding features. Typical tongue images of the nine BC types are shown in Fig. 8. The representation space of the tongue images generated by the trained decoding network is illustrated in Fig. 9.

Fig. 8 Nine typical tongue images belonging to different BC types. The leftmost tongue image is the balanced type, named Gentleness; the other tongue images belong to unbalanced BC types

Fig. 9 Tongue images generated by a trained decoding network. The images in the leftmost column are the gentleness type \(\boldsymbol{h}^{(1)}\), corresponding to the state vector (1,0,0,0,0,0,0,0,0), and the images in the rightmost column are the other, sub-healthy BC types, corresponding to \(\boldsymbol{h}^{(2)} \sim \boldsymbol{h}^{(9)}\). In each row, the images are generated from interpolated codes that move the representation vector of gentleness toward the vector of another BC type in steps of 0.1. For example, in the first row, the latent state vector of the second tongue image from the left is (0.9,0.1,0,0,0,0,0,0,0)
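The interpolated codes in the Fig. 9 caption can be reproduced with a short Python sketch. The helper names `one_hot` and `interpolate` are hypothetical. The sketch assumes that each BC type is encoded as a one-hot vector, following the pattern of the gentleness vector given in the caption, and that a row runs from one type to the other in ten equal steps. It covers only the state vectors, not the decoding network that turns them into images.

```python
def one_hot(index, size=9):
    """Return the state vector h(index): a 1 at position index (1-based)."""
    return [1.0 if i == index - 1 else 0.0 for i in range(size)]


def interpolate(start, end, steps=10):
    """Linearly blend start into end, moving 1/steps of the way each time."""
    codes = []
    for k in range(steps + 1):
        t = k / steps
        # Rounding hides floating-point noise such as 0.30000000000000004.
        codes.append([round((1 - t) * a + t * b, 10)
                      for a, b in zip(start, end)])
    return codes


first_row = interpolate(one_hot(1), one_hot(2))
print(first_row[1])  # [0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

The second code in the first row matches the vector (0.9,0.1,0,0,0,0,0,0,0) quoted in the caption. Feeding each code to a decoder would give one row of images.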

1.2 Different scales of the regulating network

The scale of the hidden layer in the regulating network can affect the complexity of the treatment. The intervention dimension can also affect the probability of finding a better healthcare strategy. We generate 5 random regulating networks for each scale, and use the conceptual alignment DDPG to treat the same model five times. The best converged results are shown in Tables 9, 10 and 11.

Table 9 The best converged results of 5 random regulating networks with 20-dimensional input and 20 hidden neurons
Table 10 The best converged results of 5 random regulating networks with 20-dimensional input and 50 hidden neurons
Table 11 The best converged results of 5 random regulating networks with 20-dimensional input and 100 hidden neurons
Table 12 The best converged results of 5 random regulating networks with 10-dimensional input and 20 hidden neurons
Table 13 The best converged results of 5 random regulating networks with 50-dimensional input and 20 hidden neurons
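The experimental setup behind these tables can be outlined in Python as a rough sketch. The (input dimension, hidden neurons) pairs come from the table captions. Everything else is an assumption made for illustration: the single hidden layer, the Gaussian weights, the tanh activation, the softmax output over nine states, and the names `random_regulating_network` and `forward`. The sketch does not include the conceptual alignment DDPG training itself.

```python
import math
import random

STATE_DIM = 9  # health state size (assumed from Appendix 1.1)

# (input dimension, hidden neurons) pairs from the captions of Tables 9-13.
SCALES = [(20, 20), (20, 50), (20, 100), (10, 20), (50, 20)]


def random_regulating_network(input_dim, hidden, rng):
    """Hypothetical one-hidden-layer network with Gaussian random weights."""
    w1 = [[rng.gauss(0, 1) for _ in range(input_dim)] for _ in range(hidden)]
    w2 = [[rng.gauss(0, 1) for _ in range(hidden)] for _ in range(STATE_DIM)]
    return w1, w2


def forward(network, intervention):
    """Map an intervention vector to a state vector that sums to 1."""
    w1, w2 = network
    hidden = [math.tanh(sum(w * x for w, x in zip(row, intervention)))
              for row in w1]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
# Five random networks per scale, mirroring the setup in the text.
networks = {scale: [random_regulating_network(*scale, rng) for _ in range(5)]
            for scale in SCALES}
```

Each entry of `networks` holds the five random networks for one scale. A treatment module would then be trained against each of them, and the best converged result would be recorded.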


About this article


Cite this article

Dai, Y., Wang, G., Muhammad, K. et al. A closed-loop healthcare processing approach based on deep reinforcement learning. Multimed Tools Appl 81, 3107–3129 (2022). https://doi.org/10.1007/s11042-020-08896-5


  • DOI: https://doi.org/10.1007/s11042-020-08896-5
