Abstract
The goal in sequential decision making under uncertainty is to find good or optimal policies for selecting actions in stochastic environments so as to achieve a long-term goal; such problems are typically modeled as Markov decision processes (MDPs). A key concept in MDPs is the value function, a real-valued function that summarizes the long-term goodness of a decision into a single number and allows optimal decision making to be formulated as an optimization problem. Exact representation of value functions in large real-world problems is infeasible; therefore, a large body of research has been devoted to value-function approximation methods, which sacrifice some representation accuracy for the sake of scalability. These methods have proved effective for deriving good policies in hard decision problems and have laid the foundation for efficient reinforcement learning algorithms, which learn good policies in unknown stochastic environments through interaction.
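To make these ideas concrete, the minimal sketch below (not part of the original entry) approximates the value function of a fixed policy on a small random-walk chain using a linear, state-aggregation feature map and semi-gradient TD(0) updates; the environment, feature construction, and step size are illustrative assumptions, not a prescribed method.

```python
import numpy as np

# Illustrative sketch: linear value-function approximation for a fixed policy
# on a small random-walk chain, learned with semi-gradient TD(0).
# All constants and the environment below are assumptions chosen for clarity.

N_STATES = 19          # non-terminal states of the chain
GAMMA = 1.0            # undiscounted episodic task
ALPHA = 0.05           # TD step size
N_FEATURES = 5         # coarse state-aggregation features (far fewer than states)

def features(state):
    """Map a state index to a small feature vector (state aggregation)."""
    phi = np.zeros(N_FEATURES)
    phi[state * N_FEATURES // N_STATES] = 1.0
    return phi

def episode(rng):
    """Generate one episode of the random walk under the uniform random policy."""
    s = N_STATES // 2
    while True:
        s_next = s + rng.choice([-1, 1])
        if s_next < 0:                 # fell off the left end: reward -1, terminate
            yield s, -1.0, None
            return
        if s_next >= N_STATES:         # fell off the right end: reward +1, terminate
            yield s, +1.0, None
            return
        yield s, 0.0, s_next
        s = s_next

rng = np.random.default_rng(0)
w = np.zeros(N_FEATURES)               # weights of the approximate value function
for _ in range(5000):
    for s, r, s_next in episode(rng):
        v_next = 0.0 if s_next is None else features(s_next) @ w
        td_error = r + GAMMA * v_next - features(s) @ w
        w += ALPHA * td_error * features(s)   # semi-gradient TD(0) update

print("Approximate values:", [round(float(features(s) @ w), 2) for s in range(N_STATES)])
```

Because the feature map aggregates several states into one weight, the learned values are only approximately correct, illustrating the accuracy-for-scalability trade-off described above.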