Abstract
Transfer learning (TL)-assisted deep reinforcement learning (DRL) has attracted much attention in recent years; it aims to enhance reinforcement learning performance by leveraging prior knowledge from previously learned tasks. However, it remains challenging to achieve positive knowledge transfer when the target task is dissimilar to the source tasks, e.g., when the source and target tasks possess different environmental dynamics. Motivated by this, this paper explores TL in DRL across tasks with heterogeneous dynamics to enhance reinforcement learning performance. In particular, we propose to combine policy reuse with learning from demonstrations for knowledge transfer in DRL. The method adaptively fuses multiple policies learned on separate source tasks into a teacher policy for the target task, which is then used for knowledge transfer via learning from demonstrations to accelerate the learning of the target DRL agent. To evaluate the proposed method, comprehensive empirical studies were conducted on the continuous control tasks Reacher and HalfCheetah. The results show that the proposed method outperforms recently proposed algorithms in terms of both accumulated reward and training computational cost.
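The policy-fusion step described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact architecture: the softmax attention weights, the toy one-dimensional source policies, and the fixed attention logits are all assumptions made for demonstration (in the actual method, the attention weights would be learned during target-task training).

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array of logits."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def fuse_policies(state, source_policies, attention_logits):
    """Fuse the actions proposed by several source policies into a single
    teacher action, weighted by softmax attention over the logits."""
    weights = softmax(attention_logits)                      # one weight per source policy
    actions = np.stack([pi(state) for pi in source_policies])  # (n_policies, action_dim)
    return weights @ actions                                 # attention-weighted teacher action

# Toy source policies for a 1-D continuous action space (illustrative only).
pi_a = lambda s: np.array([1.0])
pi_b = lambda s: np.array([-1.0])

state = np.zeros(3)
teacher_action = fuse_policies(state, [pi_a, pi_b], np.array([2.0, 0.0]))
```

The teacher action produced this way could then supply a behavior-cloning signal for the target agent, e.g. an auxiliary squared-error term between the student's action and the teacher's action added to the usual policy-gradient objective.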
Acknowledgement
This work is partially supported by the National Natural Science Foundation of China (NSFC) under Grant No. 61876025.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Yao, P., Feng, L. (2021). Integrating Policy Reuse with Learning from Demonstrations for Knowledge Transfer in Deep Reinforcement Learning. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, vol 1516. Springer, Cham. https://doi.org/10.1007/978-3-030-92307-5_77
Print ISBN: 978-3-030-92306-8
Online ISBN: 978-3-030-92307-5