
Skill based transfer learning with domain adaptation for continuous reinforcement learning domains

Applied Intelligence

Abstract

Although reinforcement learning is an effective machine learning technique, it can perform poorly on complex, real-world problems and converge slowly. This issue is magnified in continuous domains, where the curse of dimensionality is unavoidable and generalization is essential. Transfer learning is a successful remedy: it significantly improves learning performance by providing generalization not only within a task but also across different yet related or similar tasks. The critical issue in transfer learning is how to incorporate knowledge acquired while learning a different but related task in the past. Domain adaptation is a promising paradigm for addressing this challenge. In this paper, we propose a novel skill-based Transfer Learning with Domain Adaptation (TLDA) approach suitable for continuous RL problems. TLDA discovers and learns skills as high-level knowledge from the source task and then uses a domain adaptation technique to help the agent discover a state-action mapping that relates the source and target tasks. With such a mapping, TLDA can adapt the source skills and speed up learning on a new target task. The experimental results confirm that TLDA is an effective transfer learning method for continuous reinforcement learning problems.
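To make the idea above concrete, here is a minimal, runnable Python sketch of the pipeline the abstract describes: represent source skills, estimate a mapping between the source and target domains from sample data, and reuse the mapped skills in the target task. All class and function names, and the use of a simple least-squares mapping, are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the TLDA idea from the abstract (assumptions throughout):
# (1) keep source skills as simple parametric policies,
# (2) learn a linear state mapping between target and source domains,
# (3) express each source skill in the target domain to seed learning there.

import numpy as np


class Skill:
    """A source skill modeled as a linear policy over state features (assumption)."""
    def __init__(self, weights):
        self.weights = weights          # shape: (action_dim, state_dim)

    def act(self, state):
        return self.weights @ state     # deterministic action for a state


def learn_state_mapping(source_states, target_states):
    """Least-squares mapping M with M @ target_state ~ source_state.
    Stands in for the paper's domain adaptation step (illustrative only)."""
    # Solve target_states @ M.T ~ source_states in the least-squares sense.
    M, *_ = np.linalg.lstsq(target_states, source_states, rcond=None)
    return M.T


def adapt_skill(skill, state_map, action_map):
    """Express a source skill in the target domain: map the target state into
    the source space, query the source skill, then map the action back."""
    return Skill(action_map @ skill.weights @ state_map)


# Toy usage: 3-D source states, 2-D target states, 1-D actions in both domains.
rng = np.random.default_rng(0)
source_states = rng.normal(size=(100, 3))
target_states = source_states[:, :2] + 0.01 * rng.normal(size=(100, 2))

state_map = learn_state_mapping(source_states, target_states)   # shape (3, 2)
action_map = np.eye(1)                                           # identity here
source_skill = Skill(rng.normal(size=(1, 3)))
target_skill = adapt_skill(source_skill, state_map, action_map)
print(target_skill.act(target_states[0]))
```

In the paper itself, the domain adaptation step would replace this least-squares stand-in, and the adapted skills would be used to speed up a continuous RL learner on the target task rather than serve as a final policy.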



Author information


Corresponding author

Correspondence to Farzaneh Shoeleh.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Shoeleh, F., Asadpour, M. Skill based transfer learning with domain adaptation for continuous reinforcement learning domains. Appl Intell 50, 502–518 (2020). https://doi.org/10.1007/s10489-019-01527-z
