Effective Skill Learning on Vascular Robotic Systems: Combining Offline and Online Reinforcement Learning

  • Conference paper
Neural Information Processing (ICONIP 2023)

Abstract

Vascular robotic systems, which have gained popularity in clinical practice, provide a platform for potentially semi-automated surgery. Reinforcement learning (RL) is an appealing skill-learning method for facilitating automatic instrument delivery. However, the notorious sample inefficiency of RL has limited its application in this domain. To address this issue, this paper proposes a novel RL framework, Distributed Reinforcement learning with Adaptive Conservatism (DRAC), which learns manipulation skills from a modest number of interactions. DRAC pretrains skills on rule-based interactions before online fine-tuning, exploiting prior knowledge to improve sample efficiency. Moreover, DRAC uses adaptive conservatism to explore safely during online fine-tuning and a distributed structure to shorten training time. Experiments in a pre-clinical environment demonstrate that DRAC can deliver a guidewire to the target with less dangerous exploration and better performance than prior methods (success rate of 96.00% and mean backward steps of 9.54) within 20k interactions. These results indicate that the proposed algorithm is a promising approach to learning skills for vascular robotic systems.
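The abstract describes DRAC only at a high level, so the sketch below illustrates the general recipe it names: offline pretraining on rule-based interactions with a conservative critic, followed by online fine-tuning in which the conservatism weight is relaxed as online experience accumulates. This is a minimal toy in PyTorch under stated assumptions; the dimensions, networks, data, and annealing schedule are illustrative stand-ins, the distributed actors are omitted, and none of it is the authors' implementation.

```python
# Toy sketch of offline pretraining followed by online fine-tuning with an
# adaptive conservatism weight. All names and hyperparameters are assumptions.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, ACTION_DIM = 8, 2      # assumed toy dimensions
GAMMA = 0.99                      # discount factor

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                         nn.Linear(256, out_dim))

actor = mlp(STATE_DIM, ACTION_DIM)          # deterministic policy head
critic = mlp(STATE_DIM + ACTION_DIM, 1)     # Q(s, a)
actor_opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=3e-4)

def q_value(s, a):
    return critic(torch.cat([s, a], dim=-1))

def update(batch, conservatism):
    """One actor-critic step with a conservatism penalty that pushes Q-values
    of policy actions down relative to dataset actions."""
    s, a, r, s2, done = batch
    with torch.no_grad():
        target = r + GAMMA * (1 - done) * q_value(s2, torch.tanh(actor(s2)))
    td_loss = F.mse_loss(q_value(s, a), target)
    penalty = (q_value(s, torch.tanh(actor(s))) - q_value(s, a)).mean()
    critic_loss = td_loss + conservatism * penalty
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    actor_loss = -q_value(s, torch.tanh(actor(s))).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

def sample(buffer, n=64):
    rows = random.sample(buffer, min(n, len(buffer)))
    return [torch.stack(list(col)) for col in zip(*rows)]

# Phase 1: offline pretraining on rule-based interactions (random stand-in data).
offline_buffer = [(torch.randn(STATE_DIM), torch.rand(ACTION_DIM) * 2 - 1,
                   torch.randn(1), torch.randn(STATE_DIM), torch.zeros(1))
                  for _ in range(1000)]
for _ in range(500):
    update(sample(offline_buffer), conservatism=1.0)   # strong conservatism offline

# Phase 2: online fine-tuning, relaxing conservatism as online data accumulates.
online_buffer = list(offline_buffer)
for step in range(2000):
    # A real system would roll out the policy in the environment here.
    online_buffer.append(offline_buffer[step % len(offline_buffer)])
    conservatism = max(0.1, 1.0 - step / 2000)         # simple annealing schedule
    update(sample(online_buffer), conservatism)
```

The design choice mirrored here is that the critic is penalized for overvaluing the policy's own actions while the buffer is still dominated by offline, rule-based trajectories, and that penalty is gradually reduced once online experience fills the buffer; how DRAC actually adapts its conservatism is detailed in the paper itself, not in this sketch.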

This work was supported in part by the National Natural Science Foundation of China under Grant 62003343, Grant 62222316, Grant U1913601, Grant 62073325, Grant U20A20224, and Grant U1913210; in part by the Beijing Natural Science Foundation under Grant M22008; in part by the Youth Innovation Promotion Association of Chinese Academy of Sciences (CAS) under Grant 2020140; in part by the CIE-Tencent Robotics X Rhino-Bird Focused Research Program.

Author information

Corresponding authors

Correspondence to Xiao-Hu Zhou or Zeng-Guang Hou.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Li, H. et al. (2024). Effective Skill Learning on Vascular Robotic Systems: Combining Offline and Online Reinforcement Learning. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1969. Springer, Singapore. https://doi.org/10.1007/978-981-99-8184-7_3

  • DOI: https://doi.org/10.1007/978-981-99-8184-7_3

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8183-0

  • Online ISBN: 978-981-99-8184-7

  • eBook Packages: Computer Science, Computer Science (R0)
