Abstract
Demonstrations usually accelerate the training of deep reinforcement learning (RL) agents and guide them toward learning complicated policies. Most current deep RL approaches with demonstrations assume that a sufficient amount of high-quality demonstrations is available. In most real-world learning settings, however, the available demonstrations are limited in both quantity and quality. In this paper, we present an accelerated deep RL approach with dual replay buffer management and dynamic frame skipping on demonstrations. The dual replay buffer manager maintains a human replay buffer and an actor replay buffer with independent sampling policies. We also propose dynamic frame skipping on demonstrations, called DFS-ER (Dynamic Frame Skipping-Experience Replay), which learns the action repetition factor of the demonstrations. DFS-ER accelerates deep RL by improving the efficiency of demonstration utilization, thereby yielding faster exploration of the environment. We verified the training acceleration in three dense-reward environments and one sparse-reward environment against the conventional approach. In our evaluation on Atari game environments, the proposed approach reduced training iterations by 21.7%-39.1% in the sparse-reward environment.
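To make the dual replay buffer idea concrete, the sketch below keeps demonstration (human) and agent (actor) transitions in separate fixed-capacity buffers and mixes them into each training batch at a fixed demonstration ratio. This is a minimal illustration only: the class names and the `demo_ratio` parameter are assumptions for this sketch, not the paper's actual independent sampling policies.

```python
import random


class ReplayBuffer:
    """Fixed-capacity FIFO buffer of transitions."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.storage = []

    def add(self, transition):
        # Evict the oldest transition once the buffer is full.
        if len(self.storage) >= self.capacity:
            self.storage.pop(0)
        self.storage.append(transition)

    def sample(self, n):
        # Uniform sampling without replacement, capped at buffer size.
        return random.sample(self.storage, min(n, len(self.storage)))


class DualReplayBufferManager:
    """Holds human (demonstration) and actor (agent) transitions in
    separate buffers and draws each batch with a fixed demo fraction."""

    def __init__(self, human_capacity, actor_capacity, demo_ratio=0.25):
        self.human = ReplayBuffer(human_capacity)  # demonstration data
        self.actor = ReplayBuffer(actor_capacity)  # agent-generated data
        self.demo_ratio = demo_ratio               # fraction of demos per batch

    def sample_batch(self, batch_size):
        n_demo = int(batch_size * self.demo_ratio)
        return (self.human.sample(n_demo)
                + self.actor.sample(batch_size - n_demo))
```

In practice, keeping the two buffers separate prevents the agent's growing experience from crowding out the scarce demonstration data, which is the failure mode a single shared buffer suffers from when demonstrations are limited.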
Acknowledgements
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07043858, 2018R1D1A1B07049923), by the supercomputing department at KISTI (Korea Institute of Science and Technology Information) (K-19-L02-C07-S01), and by the Technology Innovation Program (P0006720) funded by MOTIE, Korea.
This article belongs to the Topical Collection: Artificial Intelligence and Big Data Computing
Guest Editors: Wookey Lee and Hiroyuki Kitagawa
Electronic supplementary material: ESM 1 (DOCX 24.9 kb)
Cite this article
Yeo, S., Oh, S. & Lee, M. Accelerated deep reinforcement learning with efficient demonstration utilization techniques. World Wide Web 24, 1275–1297 (2021). https://doi.org/10.1007/s11280-019-00763-0