
Source tasks selection for transfer deep reinforcement learning: a case of study on Atari games

  • S.I.: LatinX in AI Research

Neural Computing and Applications

Abstract

Deep reinforcement learning (DRL) combines the benefits of deep learning and reinforcement learning. However, it still requires long training times and a large number of instances to reach acceptable performance. Transfer learning (TL) offers an alternative for reducing the training time of DRL agents, using fewer instances and, in some cases, improving performance. In this work, we propose a transfer learning formulation for DRL across tasks. A relevant problem in TL that we address herein is how to select a pre-trained model that will be useful for the target task. We consider the entropy of the feature maps in the hidden layers of the convolutional neural network, together with the action spaces of the tasks, as relevant features for selecting a pre-trained model, which is then fine-tuned for the target task. We report experimental results of the proposed source task selection methodology using Deep Q-Networks to learn to play Atari games, although the method could also be applied to other DRL algorithms (e.g., DDQN, C51) and to other domains. Results reveal that, most of the time, the proposed method selects source tasks that improve the performance of a model trained from scratch. Additionally, we introduce a method for selecting the most relevant kernels for the target task; the results show that transferring a subset of the convolutional kernels achieves performance similar to training the model from scratch while using fewer parameters.
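The selection procedure outlined in the abstract lends itself to a short illustration. The Python/NumPy sketch below shows one plausible way to (a) score candidate source models by the entropy of their feature maps on target-task frames, preferring sources whose action space matches the target's, and (b) keep only the top-k convolutional kernels for transfer. This is a minimal sketch of the general idea, not the authors' implementation (see the repository linked under "Code availability"); the helper get_feature_maps, the n_actions attribute, and the exact scoring rule are assumptions made for illustration.

import numpy as np

def feature_map_entropy(activations, n_bins=32):
    """Shannon entropy (in nats) of a batch of feature-map activations,
    estimated by histogramming the activation values into n_bins bins."""
    hist, _ = np.histogram(np.asarray(activations).ravel(), bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]                      # drop empty bins to avoid log(0)
    return float(-np.sum(p * np.log(p)))

def rank_source_tasks(source_models, target_frames, target_n_actions):
    """Rank candidate pre-trained models for a target task.

    Scores each source model by the entropy of its hidden-layer feature
    maps on frames from the target task, preferring sources whose action
    space matches the target's. The combination rule is illustrative only.
    """
    scored = []
    for name, model in source_models.items():
        maps = get_feature_maps(model, target_frames)   # hypothetical helper
        score = feature_map_entropy(maps)
        same_actions = (model.n_actions == target_n_actions)  # hypothetical attribute
        scored.append((name, same_actions, score))
    # Matching action spaces first, then higher feature-map entropy.
    scored.sort(key=lambda t: (t[1], t[2]), reverse=True)
    return scored

def select_top_kernels(kernels, relevance, k):
    """Keep the k convolutional kernels with the highest relevance scores.

    kernels: array of shape (H, W, in_channels, out_channels);
    relevance: one score per output channel (any per-kernel statistic).
    """
    top = np.argsort(relevance)[-k:]
    return kernels[..., top]

In practice, relevance could be any per-kernel statistic computed on target-task data; the specific criteria used in the paper are available in the linked repository.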



Code availability

The code is available in the following repository: https://github.com/gr-jesus/Source-Task-Selection.



Acknowledgements

The authors gratefully acknowledge the computer resources, technical advice, and support provided by the Laboratorio Nacional de Supercómputo del Sureste de México (LNS), a member of the CONACYT national laboratories, through Projects No. 201901047C and 202002030c. We also acknowledge the Laboratorio Nacional de Supercómputo del Bajío through Project No. 2020.1. Jesús García-Ramírez acknowledges CONACYT for the scholarship supporting his PhD studies, associated with CVU number 701191. Hugo Jair Escalante was supported by Project Grant CONACYT CB-S-26314.

Funding

This work was supported by a CONACYT PhD scholarship associated with CVU number 701191 and by Project Grant CONACYT CB-S-26314.

Author information


Contributions

JG-R, EFM, and HJE contributed equally to this study.

Corresponding author

Correspondence to Jesús García-Ramírez.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

García-Ramírez, J., Morales, E.F. & Escalante, H.J. Source tasks selection for transfer deep reinforcement learning: a case of study on Atari games. Neural Comput & Applic 35, 18099–18111 (2023). https://doi.org/10.1007/s00521-021-06419-3
