
Automated Transfer for Reinforcement Learning Tasks

  • Discussion
  • Published in: KI - Künstliche Intelligenz

Abstract

Reinforcement learning applications are hampered by the tabula rasa approach taken by existing techniques. Transfer for reinforcement learning tackles this problem by enabling the reuse of previously learned behaviours. To be fully autonomous, a transfer agent has to: (1) automatically choose one or more relevant source tasks for a given target task, (2) learn the relation between the tasks, and (3) transfer between the tasks effectively and efficiently. Currently, most transfer frameworks require substantial human intervention in at least one of these three steps. This discussion paper aims at: (1) positioning various knowledge re-use algorithms as forms of transfer, and (2) arguing for the validity and feasibility of autonomous transfer by detailing potential solutions to the three steps above.
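To make the three steps concrete, here is a minimal, self-contained sketch of an autonomous transfer loop. It is only an illustration: the task descriptors, the linear least-squares inter-task mapping, and the linear value-function weights are simplifying assumptions made for this sketch, not the methods discussed in the paper.

    import numpy as np

    # Minimal sketch of the three steps named in the abstract. All names and
    # the linear-mapping assumption are illustrative simplifications.

    def select_source(source_tasks, target_descriptor):
        """Step 1: pick the source task whose descriptor is closest to the target's."""
        dists = [np.linalg.norm(t["descriptor"] - target_descriptor) for t in source_tasks]
        return source_tasks[int(np.argmin(dists))]

    def learn_mapping(source_states, target_states):
        """Step 2: fit a linear inter-task mapping W with target ~= source @ W."""
        W, *_ = np.linalg.lstsq(source_states, target_states, rcond=None)
        return W

    def transfer_value_weights(w_source, W):
        """Step 3: re-express a linear value function v(s) = s . w in target coordinates."""
        # target_state @ pinv(W) approximates the corresponding source state, so
        # pinv(W) @ w_source yields usable initial weights for the target task.
        return np.linalg.pinv(W) @ w_source

    # Toy usage; random arrays stand in for (paired) source/target samples.
    rng = np.random.default_rng(0)
    sources = [{"descriptor": rng.normal(size=4),
                "states": rng.normal(size=(50, 3)),   # 3-D source state samples
                "w": rng.normal(size=3)}              # learned source value weights
               for _ in range(3)]
    target = {"descriptor": rng.normal(size=4),
              "states": rng.normal(size=(50, 5))}     # 5-D target state samples

    src = select_source(sources, target["descriptor"])
    W = learn_mapping(src["states"], target["states"])   # shape (3, 5)
    w_init = transfer_value_weights(src["w"], W)          # shape (5,): warm start in the target

In practice, step (2) would use a richer mapping, for example one hand-coded as in [13] or learned via sparse coding [2], and step (3) would transfer whatever knowledge the source agent actually holds (policies, value functions, or samples).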


Notes

  1. In policy iteration algorithms, for example, the policy space can be defined as the space of all possible policies that can be learnt. In other words, this space can be defined by a combination of basis functions and parameterisations spanning different policies.
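     As a standard illustration (a textbook construction, not specific to this paper): with fixed basis functions \(\phi(s,a) \in \mathbb{R}^{d}\), each parameter vector \(\theta \in \mathbb{R}^{d}\) selects one policy, and the policy space is spanned by varying \(\theta\):

     \[
     \pi_{\theta}(a \mid s) = \frac{\exp\!\left(\theta^{\top}\phi(s,a)\right)}{\sum_{a'}\exp\!\left(\theta^{\top}\phi(s,a')\right)},
     \qquad
     \Pi = \left\{\pi_{\theta} : \theta \in \mathbb{R}^{d}\right\}.
     \]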

  2. Such a setting is typical in continuous reinforcement learning; the reasons relate to (1) the representation of the Q-function, and (2) the representations of the state and action spaces.

  3. A typical criterion used is to maximise the expected value of the total discounted pay-off signal.
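     Written out (a standard formulation, with \(\gamma \in [0,1)\) the discount factor and \(r_{t}\) the pay-off at time \(t\)), the criterion is

     \[
     \pi^{*} = \arg\max_{\pi} \; \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t} \;\middle|\; \pi\right].
     \]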

  4. \(\mathfrak{X}_{\text{transfer}}\) can either be: (1) hand-coded (see [13]), or (2) learned from source and target samples [2].
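     For the hand-coded case (1), a toy sketch might look as follows; the state-variable and action names are hypothetical placeholders that only illustrate relating target quantities to source ones, not the actual mappings of [13].

         # Toy hand-coded inter-task mapping; variable and action names are
         # hypothetical and do not reproduce the mappings of the cited work.

         STATE_MAP = {"position": "cart_position",   # source variable <- target variable
                      "velocity": "cart_velocity"}
         ACTION_MAP = {"push_left": "left", "push_right": "right", "no_push": "left"}

         def to_source(target_state, target_action):
             """Translate a target (state, action) pair into source-task terms."""
             source_state = {src: target_state[tgt] for src, tgt in STATE_MAP.items()}
             return source_state, ACTION_MAP[target_action]

         # A stub source Q-function queried through the mapping.
         def q_source(state, action):
             return state["position"] * (1.0 if action == "right" else -1.0)

         s_src, a_src = to_source({"cart_position": 0.3, "cart_velocity": -0.1,
                                   "pole_angle": 0.05}, "push_right")
         print(q_source(s_src, a_src))   # 0.3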

  5. Typically, \(n_{2} \ll n_{1}\), as only a few transitions are available from the target task.

References

  1. Abbeel P, Ng AY (2004) Apprenticeship learning via inverse reinforcement learning. In: Proceedings of the 21st International Conference on Machine Learning (ICML '04), ACM, New York

  2. Ammar HB, Taylor ME, Tuyls K, Driessens K, Weiss G (2012) Reinforcement learning transfer via sparse coding. In: Proceedings of the 11th Conference on Autonomous Agents and Multiagent Systems (AAMAS), Valencia

  3. Argall BD, Chernova S, Veloso M, Browning B (2009) A survey of robot learning from demonstration. Robot Auton Syst 57(5):469–483


  4. Buşoniu L, Babuška R, De Schutter B, Ernst D (2010) Reinforcement learning and dynamic programming using function approximators. CRC Press, Boca Raton

  5. Castro PS, Precup D (2010) Using bisimulation for policy transfer in MDPs. In: Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS '10), International Foundation for Autonomous Agents and Multiagent Systems, Richland, pp 1399–1400

  6. Ferns N, Panangaden P, Precup D (2004) Metrics for finite Markov decision processes. In: Chickering DM, Halpern JY (eds) UAI, AUAI Press, pp 162–169

  7. Ferns N, Panangaden P, Precup D (2011) Bisimulation metrics for continuous Markov decision processes. SIAM J Comput 40(6):1662–1714


  8. Knox WB, Stone P, Breazeal C (2013) Teaching agents with human feedback: a demonstration of the TAMER framework. In: IUI Companion, pp 65–66

  9. Lee H, Battle A, Raina R, Ng AY (2007) Efficient sparse coding algorithms. In: Advances in Neural Information Processing Systems (NIPS), pp 801–808

  10. Ng AY, Harada D, Russell S (1999) Policy invariance under reward transformations: theory and application to reward shaping. In: Proceedings of the 16th International Conference on Machine Learning, Morgan Kaufmann, pp 278–287

  11. Snelson E, Ghahramani Z (2006) Sparse Gaussian processes using pseudo-inputs. In: Advances in Neural Information Processing Systems, MIT Press, pp 1257–1264

  12. Taylor ME, Stone P (2009) Transfer learning for reinforcement learning domains: a survey. J Mach Learn Res 10:1633–1685


  13. Taylor ME, Stone P, Liu Y (2007) Transfer learning via inter-task mappings for temporal difference learning. J Mach Learn Res 8(1):2125–2167



Author information


Corresponding author

Correspondence to Haitham Bou Ammar.


About this article

Cite this article

Bou Ammar, H., Chen, S., Tuyls, K. et al. Automated Transfer for Reinforcement Learning Tasks. Künstl Intell 28, 7–14 (2014). https://doi.org/10.1007/s13218-013-0286-8

