
An Improved Q-Learning Algorithm for Path Planning in Maze Environments

  • Conference paper
Intelligent Systems and Applications (IntelliSys 2020)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1251)


Abstract

Path planning is the problem of finding an optimal path in a given environment, and it has become an important benchmark for testing intelligent learning algorithms. In AI-based path planning, the earliest and most deeply studied issue is intelligent obstacle avoidance: an agent must successfully avoid all obstacles or traps in an unknown environment. Compared with other learning methods, RL (Reinforcement Learning) has inherent advantages in path planning. Unlike most machine learning methods, RL is an unsupervised, active learning method: it can not only achieve effective obstacle avoidance, but also, through repeated trials, find the optimal path in an unfamiliar environment such as a maze. Q-Learning is recognized as one of the most typical RL algorithms. It is simple and practical, but it suffers from the significant disadvantage of slow convergence. This paper proposes an algorithm called ɛ-Q-Learning, which improves the traditional Q-Learning algorithm with a Dynamic Search Factor technique. Experiments show that, compared with existing Q-Learning algorithms, ɛ-Q-Learning finds better optimal paths at a lower search cost.
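The abstract does not give the paper's exact Dynamic Search Factor update, but the general idea it describes, an ɛ-greedy Q-Learning agent whose exploration rate is adjusted over the course of training rather than held fixed, can be illustrated with a short sketch. The Python code below is such a sketch, not the authors' implementation: the maze interface (env.reset(), env.step(), env.n_states, env.n_actions) is a hypothetical API, and the linear decay of ɛ is an assumed stand-in for the paper's dynamic factor.

    import numpy as np

    def epsilon_q_learning(env, episodes=500, alpha=0.1, gamma=0.9,
                           eps_start=1.0, eps_end=0.05):
        # Tabular Q-values: one row per maze state, one column per action.
        Q = np.zeros((env.n_states, env.n_actions))
        for ep in range(episodes):
            # Illustrative "dynamic search factor": decay epsilon linearly
            # from eps_start to eps_end, so early episodes explore and late
            # episodes exploit. The paper's actual schedule is not given in
            # the abstract; this linear rule is an assumption.
            eps = eps_start - (eps_start - eps_end) * ep / (episodes - 1)
            state = env.reset()          # hypothetical maze API
            done = False
            while not done:
                # Epsilon-greedy selection: explore with probability eps,
                # otherwise take the current greedy action.
                if np.random.rand() < eps:
                    action = np.random.randint(env.n_actions)
                else:
                    action = int(np.argmax(Q[state]))
                next_state, reward, done = env.step(action)
                # Standard Q-Learning (off-policy TD) update.
                Q[state, action] += alpha * (
                    reward + gamma * np.max(Q[next_state]) - Q[state, action])
                state = next_state
        return Q

After training, the greedy policy np.argmax(Q[s]) traces the learned path through the maze; shrinking ɛ over time is what lets the agent stop paying exploration cost once the path estimates have stabilized.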



Author information


Corresponding author

Correspondence to Guojun Mao.



Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Gu, S., Mao, G. (2021). An Improved Q-Learning Algorithm for Path Planning in Maze Environments. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems and Applications. IntelliSys 2020. Advances in Intelligent Systems and Computing, vol 1251. Springer, Cham. https://doi.org/10.1007/978-3-030-55187-2_40
