
Fuzzy Deep Deterministic Policy Gradient-Based Motion Controller for Humanoid Robot

Published in: International Journal of Fuzzy Systems

Abstract

In conventional robot arm control, inverse kinematics (IK) serves as the basis for computing arm joint angles. However, IK yields joint angles only once the terminal point has been specified, and it cannot optimize arm movements. Furthermore, IK is prone to the singularity problem: if a robot arm in motion passes through a singularity, the movement for the next step cannot be computed, which results in errors. Therefore, this study did not use IK to compute the joint angles of humanoid robot arms. Instead, this paper proposes a motion controller based on machine learning and fuzzy logic. Conventional reinforcement learning can provide satisfactory results for a single state but cannot perform optimized calculations over an infinite number of states. To solve this problem, this study used the deep deterministic policy gradient (DDPG) algorithm, which allows a humanoid robot to self-learn and autonomously plan the movement and joint angles of its arm. The developed neural network computes a state and its action on a hyperplane, and a continuous mapping exists between states and actions on this hyperplane. The humanoid robot thus obtained optimal learning experience over multiple self-learning runs. Finally, the controller was combined with a visual feedback system to achieve object grasping by the humanoid robot. In experiments, the humanoid robot exhibited satisfactory learning outcomes and motion control performance when fuzzy logic was combined with the DDPG algorithm.
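To make the learning scheme concrete, below is a minimal sketch of one DDPG actor-critic update in PyTorch. Everything in it (state and action dimensions, layer sizes, hyperparameters, and the helper ddpg_update) is an illustrative assumption rather than the authors' implementation, which additionally incorporates fuzzy logic and visual feedback.

    import torch
    import torch.nn as nn

    STATE_DIM = 6    # assumed: e.g., target position plus end-effector position
    ACTION_DIM = 4   # assumed: e.g., four arm joint angles

    class Actor(nn.Module):
        # Deterministic policy: maps a state to a continuous joint-angle action.
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(STATE_DIM, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, ACTION_DIM), nn.Tanh())  # actions scaled to [-1, 1]

        def forward(self, state):
            return self.net(state)

    class Critic(nn.Module):
        # Action-value function: scores a (state, action) pair.
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, 1))

        def forward(self, state, action):
            return self.net(torch.cat([state, action], dim=-1))

    def ddpg_update(actor, critic, target_actor, target_critic,
                    actor_opt, critic_opt, batch, gamma=0.99, tau=0.005):
        # One update on a replay batch (s, a, r, s2, done); r and done have shape (B, 1).
        s, a, r, s2, done = batch

        # Critic step: regress Q(s, a) toward r + gamma * Q'(s2, mu'(s2)).
        with torch.no_grad():
            q_target = r + gamma * (1.0 - done) * target_critic(s2, target_actor(s2))
        critic_loss = nn.functional.mse_loss(critic(s, a), q_target)
        critic_opt.zero_grad()
        critic_loss.backward()
        critic_opt.step()

        # Actor step: follow the deterministic policy gradient, i.e., maximize Q(s, mu(s)).
        actor_loss = -critic(s, actor(s)).mean()
        actor_opt.zero_grad()
        actor_loss.backward()
        actor_opt.step()

        # Soft-update the target networks toward the online networks.
        for net, target in ((actor, target_actor), (critic, target_critic)):
            for p, tp in zip(net.parameters(), target.parameters()):
                tp.data.mul_(1.0 - tau).add_(tau * p.data)

In practice, target_actor and target_critic start as deep copies of the online networks, and exploration noise is added to the actor's output while collecting experience into the replay buffer.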





Funding

This work was supported by the Ministry of Science and Technology, Taiwan, under Grant MOST 109-2221-E-194-053-MY3.

Author information


Corresponding author

Correspondence to Ping-Huan Kuo.

Appendix

See Tables 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 and 13.

Table 2 The error of goal A in the x-axis
Table 3 The error of goal A in the y-axis
Table 4 The error of goal A in the z-axis
Table 5 The error of goal B in the x-axis
Table 6 The error of goal B in the y-axis
Table 7 The error of goal B in the z-axis
Table 8 The error of goal C in the x-axis
Table 9 The error of goal C in the y-axis
Table 10 The error of goal C in the z-axis
Table 11 The error of goal D in terms of MAE
Table 12 The error of goal E in terms of MAE
Table 13 The error of goal F in terms of MAE
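
Tables 11, 12 and 13 report errors in terms of mean absolute error (MAE). For reference, the standard definition is assumed here, since the preview omits it: for N trials with measured end-effector coordinate x_i and target coordinate x_i* along a given axis, MAE = (1/N) * Σ_{i=1}^{N} |x_i − x_i*|.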


About this article


Cite this article

Kuo, PH., Hu, J., Lin, ST. et al. Fuzzy Deep Deterministic Policy Gradient-Based Motion Controller for Humanoid Robot. Int. J. Fuzzy Syst. 24, 2476–2492 (2022). https://doi.org/10.1007/s40815-022-01293-0

