Sequential Decision Making Under Uncertainty Using Ordinal Preferential Information

  • Conference paper
Algorithmic Decision Theory (ADT 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9346)

Abstract

The research undertaken in my thesis aims at facilitating the design of autonomous agents able to solve complex sequential decision problems (e.g., planning problems in robotics).


Author information

Corresponding author

Correspondence to Hugo Gilbert.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Gilbert, H. (2015). Sequential Decision Making Under Uncertainty Using Ordinal Preferential Information. In: Walsh, T. (ed.) Algorithmic Decision Theory. ADT 2015. Lecture Notes in Computer Science, vol. 9346. Springer, Cham. https://doi.org/10.1007/978-3-319-23114-3_36

  • DOI: https://doi.org/10.1007/978-3-319-23114-3_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23113-6

  • Online ISBN: 978-3-319-23114-3

  • eBook Packages: Computer Science, Computer Science (R0)
