An effective asynchronous framework for small scale reinforcement learning problems

Ding, Shifei; Zhao, Xingyu; Xu, Xinzheng; Sun, Tongfeng; Jia, Weikuan

doi:10.1007/s10489-019-01501-9

Published: 11 June 2019

Applied Intelligence, volume 49, pages 4303–4318 (2019)

Shifei Ding^1,2, Xingyu Zhao^1,2, Xinzheng Xu^1,2, Tongfeng Sun^1,2 & Weikuan Jia^3


Abstract

Reinforcement learning has become a research hotspot in artificial intelligence in recent years, and deep reinforcement learning has been widely used to solve various decision-making problems. However, owing to the characteristics of neural networks, deep methods easily fall into local minima when applied to small-scale, discrete-space path planning problems. Traditional reinforcement learning, meanwhile, continuously updates a single agent while the algorithm executes, which leads to slow convergence. Although some scholars have worked to address these problems, many shortcomings remain. To address them, we propose a new framework for asynchronous tabular reinforcement learning algorithms and present four new variants of asynchronous reinforcement learning algorithms. We apply these algorithms to three standard reinforcement learning environments (the frozen lake, cliff walking, and windy gridworld problems), and the simulation results show that the methods solve discrete-space path planning problems efficiently and balance exploration and exploitation well.
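The abstract describes the framework only at a high level, so the following is a minimal sketch of the general idea and not the authors' algorithm. It assumes several worker threads, each with its own exploration rate, running ordinary one-step Q-learning against a single shared Q-table on a cliff-walking grid. The 4×12 layout, the reward values, the hyperparameters, and all names (`step`, `worker`, `train`, `greedy_path`) are assumptions chosen for illustration.

```python
import random
import threading

ROWS, COLS = 4, 12                            # assumed cliff-walking grid size
START, GOAL = (3, 0), (3, 11)
ACTIONS = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left


def index(state):
    """Flatten a (row, col) state into a Q-table row index."""
    return state[0] * COLS + state[1]


def step(state, action):
    """Apply one move and return (next_state, reward, done)."""
    r = min(max(state[0] + ACTIONS[action][0], 0), ROWS - 1)
    c = min(max(state[1] + ACTIONS[action][1], 0), COLS - 1)
    if r == ROWS - 1 and 0 < c < COLS - 1:    # fell off the cliff: back to start
        return START, -100.0, False
    return (r, c), -1.0, (r, c) == GOAL


def worker(Q, episodes, epsilon, seed, alpha=0.5, gamma=1.0, max_steps=1000):
    """One agent: epsilon-greedy Q-learning that writes into the shared table."""
    rng = random.Random(seed)                 # private RNG per worker
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            values = list(Q[index(state)])    # snapshot; other threads may write
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = values.index(max(values))
            nxt, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(Q[index(nxt)])
            old = values[action]
            Q[index(state)][action] = old + alpha * (target - old)
            state = nxt
            if done:
                break


def train(epsilons=(0.3, 0.2, 0.1, 0.05), episodes=500):
    """Run one worker thread per exploration rate against a shared Q-table."""
    Q = [[0.0] * len(ACTIONS) for _ in range(ROWS * COLS)]
    threads = [threading.Thread(target=worker, args=(Q, episodes, eps, seed))
               for seed, eps in enumerate(epsilons)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return Q


def greedy_path(Q, max_steps=100):
    """Follow the greedy policy from START and return the visited states."""
    state, path = START, [START]
    for _ in range(max_steps):
        values = Q[index(state)]
        state, _, done = step(state, values.index(max(values)))
        path.append(state)
        if done:
            break
    return path


Q = train()
path = greedy_path(Q)
print(len(path) - 1, "greedy steps; reached goal:", path[-1] == GOAL)
```

Giving each worker a different exploration rate is one simple way to mix exploratory and near-greedy experience in the same table. Because the threads are scheduled nondeterministically, the learned values can differ slightly from run to run.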



Acknowledgements

This work is supported by the Fundamental Research Funds for the Central Universities (No. 2017XKZD03).

Author information

Authors and Affiliations

School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
Shifei Ding, Xingyu Zhao, Xinzheng Xu & Tongfeng Sun
Mine Digitization Engineering Research Center of Ministry of Education of the People’s Republic of China, Xuzhou, 221116, China
Shifei Ding, Xingyu Zhao, Xinzheng Xu & Tongfeng Sun
School of Information Science and Engineering, Shandong Normal University, Jinan, 265000, China
Weikuan Jia


Corresponding author

Correspondence to Shifei Ding.


About this article

Cite this article

Ding, S., Zhao, X., Xu, X. et al. An effective asynchronous framework for small scale reinforcement learning problems. Appl Intell 49, 4303–4318 (2019). https://doi.org/10.1007/s10489-019-01501-9

Published: 11 June 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s10489-019-01501-9
