Actor-Critic Neural Network Based Finite-time Control for Uncertain Robotic Systems

Author:
Changyi Lei

University of Bristol, UK

ICISDM '21: Proceedings of the 2021 5th International Conference on Information System and Data Mining, May 2021, Pages 34–40. https://doi.org/10.1145/3471287.3471288

Published: 25 September 2021

ABSTRACT

This paper investigates reinforcement learning (RL) based finite-time control (FTC) of uncertain robotic systems. The proposed methodology consists of a terminal sliding mode based finite-time controller and an Actor-Critic (AC) based RL loop that adjusts the output of the neural network. The terminal sliding mode controller is designed to ensure a calculable settling time, in contrast to conventional asymptotic stability. The AC-based RL loop uses a recursive least-squares technique to update the critic network and a policy gradient algorithm to estimate the parameters of the actor network. We show that the AC loop improves the robustness of the terminal sliding mode controller both in the approaching stage and near equilibrium. The performance of the proposed controller is compared with that of a controller using only terminal sliding mode control. The simulation results show that the proposed controller outperforms the pure terminal sliding mode controller, and that AC is a successful supplement to FTC.
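
To make the phrase "calculable settling time" concrete, the following is a minimal sketch, not the controller from the paper. It assumes a scalar tracking error e that evolves on the textbook terminal sliding surface de/dt = -beta * sign(e) * |e|^(q/p), with p > q > 0. The surface, the gains beta, p and q, and all function names are illustrative assumptions. On this surface the error reaches zero in the finite time T = p / (beta * (p - q)) * |e(0)|^((p - q) / p), whereas on a linear surface de/dt = -beta * e the error only decays exponentially and the time to reach a tolerance grows without bound as the tolerance shrinks.

```python
import math


def settling_time(e0, beta, p, q):
    """Closed-form time for e to reach 0 on de/dt = -beta*sign(e)*|e|**(q/p).

    Assumes p > q > 0 (illustrative exponents, not taken from the paper).
    """
    return p / (beta * (p - q)) * abs(e0) ** ((p - q) / p)


def simulate_terminal(e0, beta, p, q, dt=1e-4, tol=1e-6, t_max=10.0):
    """Forward-Euler integration of the terminal dynamics until |e| < tol."""
    e, t = e0, 0.0
    while abs(e) >= tol and t < t_max:
        # sign(e) * |e|**(q/p), written so it also works for negative errors
        e -= dt * beta * math.copysign(abs(e) ** (q / p), e)
        t += dt
    return t


def linear_time_to_tol(e0, beta, tol):
    """Time for the linear surface de/dt = -beta*e to bring |e| down to tol."""
    return math.log(abs(e0) / tol) / beta


if __name__ == "__main__":
    e0, beta, p, q = 1.0, 2.0, 5, 3
    print("predicted settling time:", settling_time(e0, beta, p, q))
    print("simulated settling time:", simulate_terminal(e0, beta, p, q))
    for tol in (1e-3, 1e-6, 1e-9):
        print("linear surface, tol", tol, "->", linear_time_to_tol(e0, beta, tol))
```

The simulated time should land close to the closed-form prediction, while the linear-surface time keeps growing as the tolerance is tightened. That difference is what separates finite-time from asymptotic convergence.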

Published in

ICISDM '21: Proceedings of the 2021 5th International Conference on Information System and Data Mining
May 2021
162 pages
ISBN: 9781450389549
DOI: 10.1145/3471287

Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Actor-Critic
finite-time control
neural networks
sliding mode control
Qualifiers
- research-article
- Research
- Refereed limited