Abstract
Transfer learning (TL)-assisted deep reinforcement learning (DRL) has attracted much attention in recent years; it aims to enhance reinforcement learning performance by leveraging prior knowledge from previously learned tasks. However, it remains challenging to achieve positive knowledge transfer when the target task is dissimilar to the source tasks, e.g., when the source and target tasks possess different environmental dynamics. Motivated by this, this paper explores TL in DRL across tasks with heterogeneous dynamics to enhance reinforcement learning performance. In particular, we propose to combine policy reuse with learning from demonstrations for knowledge transfer in DRL. The method adaptively fuses multiple policies learned on separate source tasks into a teacher policy for the target task, which is then used for knowledge transfer via learning from demonstrations to accelerate the learning of the target DRL agent. To evaluate the proposed method, comprehensive empirical studies were conducted on the continuous control tasks Reacher and HalfCheetah. The results show that the proposed method outperforms recently proposed algorithms in terms of both accumulated reward and training computational cost.
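The policy-fusion step described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact architecture: the softmax attention weights, the toy one-dimensional source policies, and the fixed attention logits are all assumptions made for demonstration (in the actual method, the attention weights would be learned during target-task training).

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array of logits."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def fuse_policies(state, source_policies, attention_logits):
    """Fuse the actions proposed by several source policies into a single
    teacher action, weighted by softmax attention over the logits."""
    weights = softmax(attention_logits)                      # one weight per source policy
    actions = np.stack([pi(state) for pi in source_policies])  # (n_policies, action_dim)
    return weights @ actions                                 # attention-weighted teacher action

# Toy source policies for a 1-D continuous action space (illustrative only).
pi_a = lambda s: np.array([1.0])
pi_b = lambda s: np.array([-1.0])

state = np.zeros(3)
teacher_action = fuse_policies(state, [pi_a, pi_b], np.array([2.0, 0.0]))
```

The teacher action produced this way could then supply a behavior-cloning signal for the target agent, e.g. an auxiliary squared-error term between the student's action and the teacher's action added to the usual policy-gradient objective.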
Acknowledgement
This work is partially supported by the National Natural Science Foundation of China (NSFC) under Grant No. 61876025.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Yao, P., Feng, L. (2021). Integrating Policy Reuse with Learning from Demonstrations for Knowledge Transfer in Deep Reinforcement Learning. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, vol 1516. Springer, Cham. https://doi.org/10.1007/978-3-030-92307-5_77
Print ISBN: 978-3-030-92306-8
Online ISBN: 978-3-030-92307-5