Abstract
We generalise the problem of inverse reinforcement learning to multiple tasks, learned from multiple demonstrations. Each demonstration may represent one expert trying to solve a different task, or different experts trying to solve the same task. Our main contribution is to formalise the problem as statistical preference elicitation via a number of structured priors, whose form captures our biases about the relatedness of different tasks or expert policies. In doing so, we introduce a prior on policy optimality, which is more natural to specify. We show that our framework allows us not only to learn efficiently from multiple experts, but also to effectively differentiate between the goals of each. Possible applications include analysing the intrinsic motivations of subjects in behavioural experiments and learning from multiple teachers.
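To make the setup concrete, the following is a minimal sketch of Bayesian multitask inverse reinforcement learning, not the authors' implementation. It assumes a small finite MDP, a Boltzmann (softmax) policy as a stand-in for the paper's prior on policy optimality, and a simple hierarchical Gaussian prior over per-task rewards as a stand-in for the structured priors; inference is by random-walk Metropolis-Hastings. All names and parameters (`beta`, `gamma`, `posterior_samples`, etc.) are illustrative assumptions.

```python
# Sketch: hierarchical Bayesian multitask IRL on a random finite MDP.
# Each task m has reward r_m ~ N(mu, I); the shared mean mu ~ N(0, I)
# couples the tasks. Demonstrations are scored under a softmax policy.
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 5, 2
gamma = 0.9   # discount factor (illustrative)
beta = 5.0    # softmax temperature: larger = closer to optimal behaviour

# Fixed random transition kernel: P[s, a] is a distribution over next states.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))

def q_values(reward):
    """Value iteration for state-only rewards; returns Q[s, a]."""
    V = np.zeros(n_states)
    for _ in range(200):
        Q = reward[:, None] + gamma * P @ V
        V = Q.max(axis=1)
    return Q

def log_likelihood(demo, reward):
    """Log-probability of observed (state, action) pairs under a
    Boltzmann-optimal policy for the given reward."""
    logits = beta * q_values(reward)
    logp = logits - np.logaddexp.reduce(logits, axis=1, keepdims=True)
    return sum(logp[s, a] for s, a in demo)

def posterior_samples(demos, n_iter=2000, step=0.1):
    """Metropolis-Hastings over (mu, r_1, ..., r_M).
    demos: one list of (state, action) pairs per expert/task."""
    M = len(demos)
    mu = np.zeros(n_states)
    r = [np.zeros(n_states) for _ in range(M)]

    def log_post(mu, r):
        lp = -0.5 * mu @ mu                          # hyperprior on mu
        for m in range(M):
            lp += -0.5 * (r[m] - mu) @ (r[m] - mu)   # task-level prior
            lp += log_likelihood(demos[m], r[m])     # demonstration data
        return lp

    lp, samples = log_post(mu, r), []
    for _ in range(n_iter):
        mu_new = mu + step * rng.standard_normal(n_states)
        r_new = [rm + step * rng.standard_normal(n_states) for rm in r]
        lp_new = log_post(mu_new, r_new)
        if np.log(rng.random()) < lp_new - lp:       # symmetric proposal
            mu, r, lp = mu_new, r_new, lp_new
        samples.append([rm.copy() for rm in r])
    return samples

# Usage: two experts' trajectories; compare their posterior mean rewards
# to differentiate their goals.
demos = [[(0, 1), (2, 0), (4, 1)], [(1, 0), (3, 1)]]
samples = posterior_samples(demos, n_iter=500)
r_hat = np.mean(samples[250:], axis=0)  # shape (n_tasks, n_states)
```

The hierarchical prior is the key design choice: the shared mean `mu` lets evidence from one expert inform the reward inferred for another, while the per-task rewards `r_m` remain free to diverge when the demonstrations genuinely pursue different goals.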
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Dimitrakakis, C., Rothkopf, C.A. (2012). Bayesian Multitask Inverse Reinforcement Learning. In: Sanner, S., Hutter, M. (eds.) Recent Advances in Reinforcement Learning. EWRL 2011. Lecture Notes in Computer Science, vol. 7188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29946-9_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29945-2
Online ISBN: 978-3-642-29946-9