ABSTRACT
This paper investigates reinforcement learning (RL)-based finite-time control (FTC) of uncertain robotic systems. The proposed methodology combines a terminal sliding mode based finite-time controller with an actor-critic (AC) RL loop that adjusts the output of a neural network. The terminal sliding mode controller is designed to guarantee a calculable settling time, in contrast to conventional asymptotic stability. The AC loop updates the critic network with a recursive least squares technique and estimates the actor network parameters with a policy gradient algorithm. We show that the AC component improves the robustness of the terminal sliding mode controller both during the approaching stage and near the equilibrium. The proposed controller is compared against a pure terminal sliding mode controller. Simulation results show that the proposed controller outperforms the pure terminal sliding mode controller, and that the AC loop is an effective supplement to FTC.
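The abstract names three ingredients: a terminal sliding surface, a critic updated by recursive least squares (RLS), and an actor updated by a policy gradient. The paper's actual networks and gains are not given here, so the following is only a minimal sketch of those three building blocks for a scalar tracking error with linear-in-features critic and actor; all names (`tsm_surface`, `RLSCritic`, `GaussianActor`) and parameter values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def tsm_surface(e, de, beta=1.0, p=5, q=3):
    # Terminal sliding surface s = de + beta * |e|^(q/p) * sign(e).
    # The fractional power q/p < 1 (odd p, q) yields finite-time convergence
    # of e once the state is confined to s = 0.
    return de + beta * np.sign(e) * np.abs(e) ** (q / p)

class RLSCritic:
    """Linear-in-features value approximator updated by recursive least squares."""
    def __init__(self, n_features, lam=0.99):
        self.w = np.zeros(n_features)          # critic weights
        self.P = np.eye(n_features) * 1e3      # inverse-covariance estimate
        self.lam = lam                         # forgetting factor

    def update(self, phi, target):
        # Standard RLS recursion: gain, prediction error, weight and P update.
        Pphi = self.P @ phi
        k = Pphi / (self.lam + phi @ Pphi)
        err = target - self.w @ phi
        self.w = self.w + k * err
        self.P = (self.P - np.outer(k, Pphi)) / self.lam
        return err

class GaussianActor:
    """Gaussian policy with linear mean, updated by a vanilla policy gradient."""
    def __init__(self, n_features, sigma=0.1, lr=1e-2):
        self.theta = np.zeros(n_features)
        self.sigma = sigma
        self.lr = lr

    def act(self, phi, rng):
        # Sample an action around the linear mean theta^T phi.
        return self.theta @ phi + self.sigma * rng.normal()

    def update(self, phi, action, advantage):
        # grad log pi(a|phi) for a Gaussian with fixed sigma.
        grad_log_pi = (action - self.theta @ phi) / self.sigma**2 * phi
        self.theta = self.theta + self.lr * advantage * grad_log_pi
```

In a closed loop, the critic's prediction error (here returned by `update`) would serve as the advantage signal driving the actor, while the actor's output compensates the sliding mode control law; the coupling details depend on the paper's specific design.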