Abstract
This paper explores the application of image augmentation, a popular regularization technique in computer vision, to reinforcement learning tasks. The analysis is based on model-free off-policy algorithms. As regularization, we augment the frames sampled from the model's replay buffer. The evaluated augmentation techniques include random changes in image contrast, random shifting, random cropping, and others. Experiments are conducted in Atari game environments: Breakout, Space Invaders, Berzerk, Wizard of Wor, and Demon Attack. Using augmentations yielded results confirming a significant acceleration of the algorithm's convergence. We also propose an adaptive mechanism for selecting the augmentation type depending on the task performed by the agent.
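The frame augmentations named in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration of applying a randomly chosen augmentation to a batch of frames sampled from a replay buffer; the function names, pad size, cutout size, and contrast range are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def random_shift(frame, max_shift=4, rng=None):
    # Random-shift augmentation: pad with edge pixels, then crop
    # a randomly offset window of the original size.
    rng = np.random.default_rng() if rng is None else rng
    h, w = frame.shape
    padded = np.pad(frame, max_shift, mode="edge")
    dy, dx = rng.integers(0, 2 * max_shift + 1, size=2)
    return padded[dy:dy + h, dx:dx + w]

def random_cutout(frame, size=8, rng=None):
    # Random-cutout augmentation: zero out a random square patch.
    rng = np.random.default_rng() if rng is None else rng
    out = frame.copy()
    h, w = frame.shape
    y = rng.integers(0, h - size + 1)
    x = rng.integers(0, w - size + 1)
    out[y:y + size, x:x + size] = 0
    return out

def random_contrast(frame, lo=0.8, hi=1.2, rng=None):
    # Random-contrast augmentation: scale intensities around the mean.
    rng = np.random.default_rng() if rng is None else rng
    factor = rng.uniform(lo, hi)
    mean = frame.mean()
    return np.clip((frame - mean) * factor + mean, 0, 255)

def augment_batch(frames, rng=None):
    # Apply one randomly chosen augmentation to each frame of a
    # batch sampled from the replay buffer.
    rng = np.random.default_rng() if rng is None else rng
    ops = [random_shift, random_cutout, random_contrast]
    return np.stack([ops[rng.integers(len(ops))](f, rng=rng) for f in frames])
```

Applying the augmentation only at sampling time leaves the stored transitions untouched, so each replayed minibatch sees a fresh random view of the same frames.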
Acknowledgements
The reported study was supported by the Ministry of Education and Science of the Russian Federation, project No. 075-15-2020-799.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Rak, A., Skrynnik, A., Panov, A.I. (2021). Flexible Data Augmentation in Off-Policy Reinforcement Learning. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2021. Lecture Notes in Computer Science(), vol 12854. Springer, Cham. https://doi.org/10.1007/978-3-030-87986-0_20
DOI: https://doi.org/10.1007/978-3-030-87986-0_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87985-3
Online ISBN: 978-3-030-87986-0
eBook Packages: Computer Science (R0)