
Fuzzy Deep Deterministic Policy Gradient-Based Motion Controller for Humanoid Robot

Published in: International Journal of Fuzzy Systems

Abstract

In conventional robot arm control, inverse kinematics (IK) serves as the basis for computing arm joint angles. However, IK yields joint angles only once the terminal point has been specified, and it cannot optimize arm movements. Furthermore, IK is prone to the singularity problem: if a robot arm in motion passes through a singularity, the movement for the next step cannot be computed, which results in errors. Therefore, this study did not use IK to compute the joint angles of humanoid robot arms. Instead, this paper proposes a motion controller based on machine learning and fuzzy logic. Conventional reinforcement learning can provide satisfactory results for a single state but cannot perform optimized calculations over an infinite number of states. To solve this problem, this study used the deep deterministic policy gradient (DDPG) algorithm, which allows a humanoid robot to self-learn and autonomously plan the movement and joint angles of its arm. The developed neural network computes a state and its action on a hyperplane, and a continuous mapping exists between states and actions on this hyperplane. The humanoid robot thus obtained optimal learning experience over multiple self-learning runs. Finally, the controller was combined with a visual feedback system to achieve object grasping by the humanoid robot. In experiments, the humanoid robot exhibited satisfactory learning outcomes and motion control performance when fuzzy logic was combined with the DDPG algorithm.
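To make the learning scheme concrete, below is a minimal sketch of one DDPG actor-critic update in PyTorch. Everything in it (state and action dimensions, layer sizes, hyperparameters, and the helper ddpg_update) is an illustrative assumption rather than the authors' implementation, which additionally incorporates fuzzy logic and visual feedback.

    import torch
    import torch.nn as nn

    STATE_DIM = 6    # assumed: e.g., target position plus end-effector position
    ACTION_DIM = 4   # assumed: e.g., four arm joint angles

    class Actor(nn.Module):
        # Deterministic policy: maps a state to a continuous joint-angle action.
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(STATE_DIM, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, ACTION_DIM), nn.Tanh())  # actions scaled to [-1, 1]

        def forward(self, state):
            return self.net(state)

    class Critic(nn.Module):
        # Action-value function: scores a (state, action) pair.
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, 1))

        def forward(self, state, action):
            return self.net(torch.cat([state, action], dim=-1))

    def ddpg_update(actor, critic, target_actor, target_critic,
                    actor_opt, critic_opt, batch, gamma=0.99, tau=0.005):
        # One update on a replay batch (s, a, r, s2, done); r and done have shape (B, 1).
        s, a, r, s2, done = batch

        # Critic step: regress Q(s, a) toward r + gamma * Q'(s2, mu'(s2)).
        with torch.no_grad():
            q_target = r + gamma * (1.0 - done) * target_critic(s2, target_actor(s2))
        critic_loss = nn.functional.mse_loss(critic(s, a), q_target)
        critic_opt.zero_grad()
        critic_loss.backward()
        critic_opt.step()

        # Actor step: follow the deterministic policy gradient, i.e., maximize Q(s, mu(s)).
        actor_loss = -critic(s, actor(s)).mean()
        actor_opt.zero_grad()
        actor_loss.backward()
        actor_opt.step()

        # Soft-update the target networks toward the online networks.
        for net, target in ((actor, target_actor), (critic, target_critic)):
            for p, tp in zip(net.parameters(), target.parameters()):
                tp.data.mul_(1.0 - tau).add_(tau * p.data)

In practice, target_actor and target_critic start as deep copies of the online networks, and exploration noise is added to the actor's output while collecting experience into the replay buffer.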





Funding

This work was supported by the Ministry of Science and Technology, Taiwan, under Grant MOST 109-2221-E-194-053-MY3.

Author information


Corresponding author

Correspondence to Ping-Huan Kuo.

Appendix

See Tables 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 and 13.

Table 2 The error of goal A in the x-axis
Table 3 The error of goal A in the y-axis
Table 4 The error of goal A in the z-axis
Table 5 The error of goal B in the x-axis
Table 6 The error of goal B in the y-axis
Table 7 The error of goal B in the z-axis
Table 8 The error of goal C in the x-axis
Table 9 The error of goal C in the y-axis
Table 10 The error of goal C in the z-axis
Table 11 The error of goal D in terms of MAE
Table 12 The error of goal E in terms of MAE
Table 13 The error of goal F in terms of MAE
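
Tables 11, 12 and 13 report errors in terms of mean absolute error (MAE). For reference, the standard definition is assumed here, since the preview omits it: for N trials with measured end-effector coordinate x_i and target coordinate x_i* along a given axis, MAE = (1/N) * Σ_{i=1}^{N} |x_i − x_i*|.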


About this article


Cite this article

Kuo, PH., Hu, J., Lin, ST. et al. Fuzzy Deep Deterministic Policy Gradient-Based Motion Controller for Humanoid Robot. Int. J. Fuzzy Syst. 24, 2476–2492 (2022). https://doi.org/10.1007/s40815-022-01293-0

