Multi-agent reinforcement learning for redundant robot control in task-space

  • Original Article
  • Published in: International Journal of Machine Learning and Cybernetics

Abstract

Task-space control requires an inverse kinematics solution or the Jacobian matrix to transform from task space to joint space. However, these are not always available for redundant robots, which have more joint degrees of freedom than Cartesian degrees of freedom. Intelligent learning methods, such as neural networks (NN) and reinforcement learning (RL), can learn the inverse kinematics solution; however, NNs need large amounts of data, and classical RL is not suitable for multi-link robots controlled in task space. In this paper, we propose a fully cooperative multi-agent reinforcement learning (MARL) scheme to solve the kinematic problem of redundant robots. Each joint of the robot is regarded as one agent. The fully cooperative MARL uses kinematic learning to avoid function approximators and a large learning space. The convergence property of the proposed MARL is analyzed. The experimental results show that our MARL outperforms classic methods such as Jacobian-based methods and neural networks.
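To make the joint-per-agent decomposition concrete, the sketch below is a minimal, illustrative implementation, not the algorithm analyzed in the paper: fully cooperative tabular Q-learning for a planar 3-link arm, which is redundant for a 2-D positioning task (three joint degrees of freedom, two Cartesian degrees of freedom). Each joint is an agent with its own Q-table over only its own actions, and all agents share one task-space reward. The link lengths, discretization, and hyperparameters are assumptions chosen for clarity.

```python
# Minimal sketch of fully cooperative MARL for a planar 3-link arm,
# NOT the paper's exact algorithm: one tabular Q-learning agent per
# joint, all agents sharing the same task-space reward. Link lengths,
# discretization, and hyperparameters are illustrative assumptions.
import numpy as np

L = np.array([0.3, 0.25, 0.2])          # assumed link lengths [m]
ACTIONS = np.array([-0.05, 0.0, 0.05])  # joint increments [rad]
N_JOINTS, N_BINS = 3, 15                # agents; error bins per axis
alpha, gamma, eps = 0.1, 0.9, 0.2       # assumed learning hyperparameters

def fkine(q):
    """Forward kinematics: end-effector (x, y) of the planar serial arm."""
    angles = np.cumsum(q)               # absolute link angles
    return np.array([np.sum(L * np.cos(angles)),
                     np.sum(L * np.sin(angles))])

def state_of(err):
    """Discretize the task-space error into one joint (x, y) bin index."""
    bins = np.clip(((err + 1.0) / 2.0 * N_BINS).astype(int), 0, N_BINS - 1)
    return bins[0] * N_BINS + bins[1]

# One Q-table per agent: each joint only learns over its own 3 actions,
# rather than a single agent learning over 3^3 joint-action combinations.
Q = np.zeros((N_JOINTS, N_BINS * N_BINS, len(ACTIONS)))

target = np.array([0.4, 0.3])           # reachable task-space goal
for episode in range(2000):
    q = np.zeros(N_JOINTS)
    s = state_of(target - fkine(q))
    for step in range(50):
        # Each agent picks its own action (epsilon-greedy).
        a = [np.random.randint(len(ACTIONS)) if np.random.rand() < eps
             else int(np.argmax(Q[i, s])) for i in range(N_JOINTS)]
        q = q + ACTIONS[a]
        err = target - fkine(q)
        s2 = state_of(err)
        r = -np.linalg.norm(err)         # shared (fully cooperative) reward
        for i in range(N_JOINTS):        # independent per-agent updates
            Q[i, s, a[i]] += alpha * (r + gamma * Q[i, s2].max()
                                      - Q[i, s, a[i]])
        s = s2
```

Because each agent's table covers only its own three actions, the number of learned values grows linearly with the number of joints instead of exponentially with the joint-action combinations; this illustrates the small learning space that the fully cooperative decomposition targets.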


Notes

  1. Task-space (or Cartesian space) is defined by the position and orientation of the robot's end effector. Joint-space is defined by the angular displacement of each joint of the robot.
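As a concrete illustration of this footnote (a hedged sketch; the link lengths are assumed, not taken from the paper), the snippet below maps two distinct joint-space configurations of a planar 2-link arm to the same task-space point, which is why an inverse kinematics solution is generally not unique.

```python
# Joint space -> task space for a planar 2-link arm; link lengths are
# assumed values for illustration only.
import numpy as np

L1, L2 = 0.3, 0.25                           # assumed link lengths [m]

def fkine(q):
    """Map a joint-space point (q1, q2) to the task-space point (x, y)."""
    return np.array([L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1]),
                     L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])])

q_a = np.array([0.6, 0.8])                   # "elbow-down" configuration
# Mirrored "elbow-up" solution reaching the same end-effector point:
psi = np.arctan2(L2 * np.sin(q_a[1]), L1 + L2 * np.cos(q_a[1]))
q_b = np.array([q_a[0] + 2.0 * psi, -q_a[1]])

print(fkine(q_a))  # approx. [0.290 0.416]
print(fkine(q_b))  # same task-space point, different joint-space point
```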


Author information

Corresponding author

Correspondence to Wen Yu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Perrusquía, A., Yu, W. & Li, X. Multi-agent reinforcement learning for redundant robot control in task-space. Int. J. Mach. Learn. & Cyber. 12, 231–241 (2021). https://doi.org/10.1007/s13042-020-01167-7

