Integrating Policy Reuse with Learning from Demonstrations for Knowledge Transfer in Deep Reinforcement Learning

  • Conference paper
  • First Online:
Neural Information Processing (ICONIP 2021)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1516))

Abstract

Transfer learning (TL) assisted deep reinforcement learning (DRL) has attracted much attention in recent years; it aims to enhance reinforcement learning performance by leveraging prior knowledge from previously learned tasks. However, it remains challenging to achieve positive knowledge transfer when the target tasks are dissimilar to the source tasks, e.g., when the source and target tasks possess diverse environmental dynamics. Motivated by this, this paper explores TL in DRL across tasks with heterogeneous dynamics. In particular, we propose to combine policy reuse and learning from demonstrations for knowledge transfer in DRL: multiple policies learned on separate source tasks are adaptively fused to generate a teacher policy for the target task, which is then used to transfer knowledge via learning from demonstrations, thereby accelerating the learning of the target DRL agent. To evaluate the proposed method, comprehensive empirical studies have been conducted on continuous control tasks, i.e., Reacher and HalfCheetah. The results show that the proposed method outperforms recently proposed algorithms in terms of both accumulated reward and training computational cost.
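The two-stage idea described in the abstract can be sketched in a few lines: source policies are adaptively fused into a teacher, and a demonstration loss pulls the student toward the teacher's actions. The softmax-attention weighting and the mean-squared demonstration loss below are illustrative assumptions for exposition, not the paper's exact formulation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array of logits."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def fuse_teacher_action(source_policies, attention_logits, state):
    """Adaptively fuse source-policy actions into a single teacher action.

    source_policies: list of callables mapping state -> action
        (deterministic policies for simplicity of the sketch).
    attention_logits: learnable weights over the source policies
        (an assumption; the paper's fusion mechanism may differ).
    """
    weights = softmax(attention_logits)                     # fusion weights
    actions = np.stack([pi(state) for pi in source_policies])
    return weights @ actions                                # weighted average

def demonstration_loss(student_action, teacher_action):
    """Behavior-cloning-style loss pulling the student toward the teacher."""
    return float(np.mean((student_action - teacher_action) ** 2))

# Toy usage: two source policies on a 2-D continuous action space.
pi_a = lambda s: np.array([1.0, 0.0])
pi_b = lambda s: np.array([0.0, 1.0])
state = np.zeros(2)
teacher = fuse_teacher_action([pi_a, pi_b], np.array([0.0, 0.0]), state)
# Equal logits give equal weights, so the teacher averages the two actions.
```

In a full training loop, this demonstration term would typically be added to the agent's ordinary RL objective (e.g., a PPO loss) with a weighting coefficient, so the target agent benefits from the teacher early on while still optimizing the task reward.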



Acknowledgement

This work is partially supported by the National Natural Science Foundation of China (NSFC) under Grant No. 61876025.

Author information

Correspondence to Liang Feng.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Yao, P., Feng, L. (2021). Integrating Policy Reuse with Learning from Demonstrations for Knowledge Transfer in Deep Reinforcement Learning. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, vol 1516. Springer, Cham. https://doi.org/10.1007/978-3-030-92307-5_77

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-92307-5_77

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-92306-8

  • Online ISBN: 978-3-030-92307-5

  • eBook Packages: Computer Science, Computer Science (R0)
