Abstract
Designing an intelligent and autonomous system remains a great challenge in the assembly field. Most reinforcement learning (RL) methods are applied to experiments with relatively small state spaces, and the complexity and high-dimensional state space of the assembly environment cause traditional RL methods to perform poorly in both efficiency and accuracy. In this paper, a model-driven adaptive proximal policy optimization (MAPPO) method was presented to enable the assembly system to autonomously rectify bolt posture errors. In the MAPPO method, a probabilistic tree and an adaptive reward mechanism were used to improve the computational efficiency and accuracy of the traditional PPO method. The size of the action space was reduced by establishing a hierarchical logical relationship among the parameters with the probabilistic tree, and the adaptive reward mechanism mitigated the tendency of the algorithm to fall into local minima. Finally, the proposed method was verified in the Unity simulation engine, and the advancement and robustness of the proposed model were validated by comparing different cases in simulations and experiments. The results revealed that MAPPO achieves better efficiency and accuracy than other state-of-the-art algorithms.
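The MAPPO method described above builds on the standard PPO algorithm. As background only, the sketch below shows PPO's clipped surrogate objective, which limits how far each policy update can move from the old policy; this is a minimal illustration of generic PPO, not the authors' MAPPO implementation, and the clipping range eps=0.2 is an assumed common default.

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Clipped surrogate objective from standard PPO (Schulman et al.).

    ratio:     pi_new(a|s) / pi_old(a|s), per sample
    advantage: estimated advantage A(s, a), per sample
    eps:       clipping range; 0.2 is a common default (assumed here)
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # The elementwise minimum keeps the update pessimistic: a large
    # policy change cannot increase the objective beyond the clip range.
    return np.minimum(unclipped, clipped)

# A ratio of 1.5 with positive advantage is clipped at 1 + eps = 1.2:
print(ppo_clip_objective(np.array([1.5]), np.array([1.0])))
```

The minimum over the clipped and unclipped terms is what discourages destructively large policy updates; MAPPO's probabilistic tree and adaptive reward then address the action-space size and local-minima issues on top of this base objective.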
Acknowledgments
We gratefully acknowledge the financial support from the National Defence Basic Scientific Research Program of China (JCKY2018208A001), and Tsinghua University-Weichai Power Joint Institute of Intelligent Manufacturing (JIIM02).
Ethics declarations
Declaration of interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Luo, W., Zhang, J., Feng, P. et al. An adaptive adjustment strategy for bolt posture errors based on an improved reinforcement learning algorithm. Appl Intell 51, 3405–3420 (2021). https://doi.org/10.1007/s10489-020-01906-x