Abstract
Robust motion control is fundamental to autonomous mobile robots. In recent years, reinforcement learning (RL) has attracted considerable attention for the feedback control of wheeled mobile robots. However, it remains difficult for RL to solve problems with large or continuous state spaces, which are common in robotics. To improve the generalization ability of RL, this paper presents a novel hierarchical RL approach for optimal path tracking of wheeled mobile robots. In the proposed approach, a graph Laplacian-based hierarchical approximate policy iteration (GHAPI) algorithm is developed, in which the basis functions are constructed automatically using the graph Laplacian operator. In GHAPI, the state space of a Markov decision process is divided into several subspaces, and approximate policy iteration is carried out on each subspace. A near-optimal path-tracking control strategy is then obtained by combining GHAPI with proportional-derivative (PD) control. The performance of the proposed approach is evaluated on a P3-AT wheeled mobile robot. It is demonstrated that the GHAPI-based PD controller obtains better near-optimal control policies than previous approaches.
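The core idea of constructing basis functions from the graph Laplacian can be illustrated with a minimal sketch. The following Python fragment is not the paper's GHAPI implementation; it assumes a k-nearest-neighbor graph over sampled states and uses the smallest eigenvectors of the combinatorial Laplacian as smooth basis functions (proto-value functions) for value-function approximation. The function names, the graph construction, and the example state features are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's exact algorithm): build a k-NN graph
# over sampled states and use Laplacian eigenvectors as basis functions.
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

def laplacian_basis(states, k_neighbors=8, num_basis=10):
    """Return num_basis smooth basis functions defined on a k-NN state graph."""
    n = len(states)
    dist = cdist(states, states)                  # pairwise distances between sampled states
    W = np.zeros((n, n))
    for i in range(n):                            # symmetric k-nearest-neighbor adjacency
        nbrs = np.argsort(dist[i])[1:k_neighbors + 1]
        W[i, nbrs] = 1.0
        W[nbrs, i] = 1.0
    D = np.diag(W.sum(axis=1))
    L = D - W                                     # combinatorial graph Laplacian
    # Eigenvectors with the smallest eigenvalues vary smoothly over the graph,
    # so they serve as basis functions for approximating the value function.
    eigvals, eigvecs = eigh(L)
    return eigvecs[:, :num_basis]                 # one column per basis function

# Usage sketch: sample states from robot trajectories (here, hypothetical
# tracking-error features), then use the basis columns inside approximate
# policy iteration (e.g., LSPI) on each state subspace.
states = np.random.uniform(-1.0, 1.0, size=(200, 3))  # e.g., (x_err, y_err, heading_err)
Phi = laplacian_basis(states)
```

In the hierarchical setting described above, such a basis would be computed per subspace of the decomposed state space, and the resulting value-function approximation steers the corrective actions applied on top of the PD path-tracking controller.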

Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grants 61075072 and 90820302, and by the Program for New Century Excellent Talents in University under Grant NCET-10-0901.