Sequential Decision Making Under Uncertainty Using Ordinal Preferential Information

  • Conference paper
Algorithmic Decision Theory (ADT 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9346)

Abstract

The research undertaken in my thesis aims at facilitating the design of autonomous agents able to solve complex sequential decision problems (e.g., planning problems in robotics).


Author information

Corresponding author

Correspondence to Hugo Gilbert.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Gilbert, H. (2015). Sequential Decision Making Under Uncertainty Using Ordinal Preferential Information. In: Walsh, T. (ed.) Algorithmic Decision Theory. ADT 2015. Lecture Notes in Computer Science, vol. 9346. Springer, Cham. https://doi.org/10.1007/978-3-319-23114-3_36

  • DOI: https://doi.org/10.1007/978-3-319-23114-3_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23113-6

  • Online ISBN: 978-3-319-23114-3

  • eBook Packages: Computer Science, Computer Science (R0)
