Reinforcement Learning and Sim-to-Real Method of Dual-Arm Robot for Capturing Non-Cooperative Dynamic Targets

Du, Wenjuan; Li, Nan; Chen, Yeheng; Wang, Jiangping

doi:10.1007/978-981-99-6492-5_24

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14270)

Included in the following conference series:

International Conference on Intelligent Robotics and Applications

Abstract

The growing amount of space debris, such as defunct satellites, poses a serious threat to human space exploration. Dual-arm robots are increasingly used for on-orbit capture tasks because they are flexible and stable. However, modeling and controlling a high-dimensional dual-arm robot is difficult, and planning a collision-free path for it is time-consuming, which makes it hard to capture a dynamic non-cooperative target with such a robot. To address these problems, this paper proposes an intelligent capture algorithm based on the PPO algorithm with the A2C framework; because it relies on reinforcement learning, it requires no model of the robot. Collision detection is introduced into training, so the trained policy network does not need real-time collision detection when it is deployed on a real robot; that is, it can output control commands in real time without spending time on path planning. Furthermore, randomization improves the generalization ability of the model. The actor networks have been tested both in simulation and on a real robot. The average capture rate is 96.8% in simulation, and a target with a rotation speed in the range [-3.0, 3.0]°/s can be caught in the real world, which demonstrates the effectiveness of the proposed intelligent capture algorithm.
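As a rough illustration of the PPO update named in the abstract, the following sketch computes the standard clipped surrogate objective of PPO for a batch of samples. It is not the authors' implementation: the function name `ppo_clip_objective`, its arguments, and the default `clip_eps=0.2` are assumptions chosen for the example, and the advantages are assumed to come from a separate critic, as in an actor-critic setup.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logps:  log-probabilities of the taken actions under the current policy
    old_logps:  log-probabilities of the same actions under the data-collecting policy
    advantages: advantage estimates for those actions (assumed to come from a critic)
    clip_eps:   clipping radius; 0.2 is an illustrative default, not a value from the paper
    """
    n = len(advantages)
    if n == 0 or len(new_logps) != n or len(old_logps) != n:
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / n
```

With a positive advantage, the objective stops rewarding the policy once the ratio exceeds `1 + clip_eps`, which limits how far a single update can move the policy away from the one that collected the data.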

This research was supported by the Center-initiated Research Project of Zhejiang Lab (No. 2021NB0AL01) and the Science and Technology on Space Intelligent Control Laboratory (No. K2022EA2KE01).

Author information

Authors and Affiliations

Zhejiang Lab, Hangzhou, 311121, Zhejiang, China
Wenjuan Du, Nan Li, Yeheng Chen & Jiangping Wang

Corresponding author

Correspondence to Wenjuan Du.

Editor information

Editors and Affiliations

Zhejiang University, Hangzhou, China
Huayong Yang
Harbin Institute of Technology, Shenzhen, China
Honghai Liu
Zhejiang University, Hangzhou, China
Jun Zou
Huazhong University of Science and Technology, Wuhan, China
Zhouping Yin
Shenyang Institute of Automation, Shenyang, Liaoning, China
Lianqing Liu
Zhejiang University, Hangzhou, China
Geng Yang
Zhejiang University, Hangzhou, China
Xiaoping Ouyang
Harbin Institute of Technology, Shenzhen, China
Zhiyong Wang

Cite this paper

Du, W., Li, N., Chen, Y., Wang, J. (2023). Reinforcement Learning and Sim-to-Real Method of Dual-Arm Robot for Capturing Non-Cooperative Dynamic Targets. In: Yang, H., et al. Intelligent Robotics and Applications. ICIRA 2023. Lecture Notes in Computer Science, vol 14270. Springer, Singapore. https://doi.org/10.1007/978-981-99-6492-5_24

DOI: https://doi.org/10.1007/978-981-99-6492-5_24
Published: 16 October 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-6491-8
Online ISBN: 978-981-99-6492-5
eBook Packages: Computer Science, Computer Science (R0)
