Abstract
Deep reinforcement learning (DRL) combines the benefits of deep learning and reinforcement learning. However, it still requires long training times and a large number of instances to reach acceptable performance. Transfer learning (TL) offers an alternative for reducing the training time of DRL agents, using fewer instances and, in some cases, improving performance. In this work, we propose a transfer learning formulation for DRL across tasks. A relevant problem of TL that we address herein is how to select a pre-trained model that will be useful for the target task. We consider the entropy of the feature maps in the hidden layers of the convolutional neural network, together with the action spaces of the tasks, as relevant features for selecting a pre-trained model that is then fine-tuned for the target task. We report experimental results of the proposed source task selection methodology when using Deep Q-Networks to learn to play Atari games, although the method could also be applied to other DRL algorithms (e.g., DDQN, C51) and to other domains. Results reveal that, most of the time, our method selects source tasks that improve performance over a model trained from scratch. Additionally, we introduce a method for selecting the kernels most relevant to the target task; transferring only this subset of convolutional kernels achieves performance similar to training the model from scratch while using fewer parameters.
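To make the selection criterion concrete, below is a minimal sketch of the two ideas in the abstract: scoring candidate pre-trained models by the entropy of their convolutional feature maps, and keeping only a high-scoring subset of kernels for transfer. It is an illustration under stated assumptions, not the authors' exact procedure: the activation shapes, the histogram bin count, the use of entropy as the kernel-relevance criterion, and all function names (`feature_map_entropy`, `select_source_task`, `top_k_kernels`) are ours.

```python
# Illustrative sketch, NOT the authors' exact procedure: score candidate
# pre-trained models by the entropy of their conv feature maps, and keep
# only the highest-entropy kernels for transfer. Shapes, bin counts, and
# the entropy-as-relevance criterion for kernels are assumptions.
import numpy as np

def feature_map_entropy(fmap, n_bins=256):
    """Shannon entropy (bits) of one 2-D feature map, estimated by
    histogramming its activations into `n_bins` bins."""
    hist, _ = np.histogram(fmap, bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]                       # 0 * log 0 is taken as 0
    return -np.sum(p * np.log2(p))

def kernel_entropies(layer_maps):
    """Per-kernel entropy for one conv layer; `layer_maps` has shape
    (n_kernels, height, width), e.g. activations on a probe batch."""
    return np.array([feature_map_entropy(fm) for fm in layer_maps])

def select_source_task(target_maps, candidates):
    """Pick the candidate pre-trained model whose mean feature-map
    entropy is closest to the target task's."""
    target_h = kernel_entropies(target_maps).mean()
    return min(candidates,
               key=lambda name: abs(kernel_entropies(candidates[name]).mean()
                                    - target_h))

def top_k_kernels(layer_maps, k):
    """Indices of the k kernels with the highest feature-map entropy,
    i.e. the subset one might transfer instead of the whole layer."""
    return np.argsort(kernel_entropies(layer_maps))[::-1][:k]

# Toy usage with random "activations" standing in for real DQN features.
rng = np.random.default_rng(0)
cands = {g: rng.random((32, 20, 20)) for g in ("pong", "breakout", "qbert")}
target = rng.random((32, 20, 20))
src = select_source_task(target, cands)
print("source task:", src, "| kernels to transfer:", top_k_kernels(cands[src], 8))
```

In practice, the probe activations would come from forward passes of each pre-trained DQN over frames of the target game, and the action spaces of source and target tasks would be compared alongside the entropy scores, as the abstract indicates.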





Code availability
The code is available in the following repository: https://github.com/gr-jesus/Source-Task-Selection.
Acknowledgements
The authors gratefully acknowledge the computer resources, technical advice, and support provided by the Laboratorio Nacional de Supercómputo del Sureste de México (LNS), a member of the CONACYT national laboratories, through Projects No. 201901047C and 202002030c. We also acknowledge the Laboratorio Nacional de Supercómputo del Bajío through Project No. 2020.1. Jesús García-Ramírez acknowledges CONACYT for the scholarship supporting his PhD studies, associated with CVU number 701191. Hugo Jair Escalante was supported by CONACYT project grant CB-S-26314.
Funding
This work was supported by a CONACYT Ph.D. scholarship associated with CVU number 701191 and by CONACYT Project Grant CB-S-26314.
Author information
Contributions
JG-R, EFM, and HJE contributed equally to this study.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflicts of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
García-Ramírez, J., Morales, E.F. & Escalante, H.J. Source tasks selection for transfer deep reinforcement learning: a case of study on Atari games. Neural Comput & Applic 35, 18099–18111 (2023). https://doi.org/10.1007/s00521-021-06419-3