Neuro-Dynamic Programming: An Overview and Recent Results

Bertsekas, Dimitri P.

doi:10.1007/978-3-540-69995-8_11

Part of the book series: Operations Research Proceedings (ORP, volume 2006)

Abstract

Neuro-dynamic programming is a methodology for sequential decision making under uncertainty, which is based on dynamic programming. The key idea is to use a scoring function to select decisions in complex dynamic systems, arising in a broad variety of applications from engineering design, operations research, resource allocation, finance, etc. This is much like what is done in computer chess, where positions are evaluated by means of a scoring function and the move that leads to the position with the best score is chosen. Neuro-dynamic programming provides a class of systematic methods for computing appropriate scoring functions using approximation schemes and simulation/evaluation of the system’s performance.
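The decision rule described in the abstract, scoring the situations a decision can lead to and picking the decision with the best score, can be sketched as a one-step lookahead. The example below is a minimal illustration under stated assumptions, not the chapter's method: the toy system (a walker on positions 0 to 10 that should stay near position 5), the noise distribution, the stage cost, and the names `next_state`, `stage_cost`, `score`, and `lookahead_decision` are all hypothetical. Costs are minimized here, so the "best" score is the lowest one, and the scoring function is hand-picked rather than computed by approximation and simulation.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical toy problem (not from the chapter): a walker on positions 0..10
# wants to stay near position 5 while being pushed around by random noise.
DECISIONS: Sequence[int] = (-1, 0, 1)
NOISE: Sequence[Tuple[int, float]] = ((-1, 0.25), (0, 0.5), (1, 0.25))  # (value, probability)


def next_state(x: int, u: int, w: int) -> int:
    """System dynamics: move by decision u plus noise w, clamped to [0, 10]."""
    return max(0, min(10, x + u + w))


def stage_cost(x: int, u: int) -> float:
    """Small cost for moving at all (an assumption made for illustration)."""
    return 0.1 * abs(u)


def score(x: int) -> float:
    """Hand-picked scoring function standing in for an approximate cost-to-go."""
    return float((x - 5) ** 2)


def lookahead_decision(x: int, score_fn: Callable[[int], float] = score) -> int:
    """Return the decision with the lowest expected stage cost plus next-state score."""
    def expected_cost(u: int) -> float:
        # Average the score of the next state over all noise outcomes.
        return stage_cost(x, u) + sum(p * score_fn(next_state(x, u, w)) for w, p in NOISE)

    return min(DECISIONS, key=expected_cost)


if __name__ == "__main__":
    for x in (2, 5, 8):
        print(x, lookahead_decision(x))
```

In this sketch the quality of the decisions depends entirely on `score`. A different scoring function, for example one fitted from simulated trajectories, can be passed as `score_fn` without changing the lookahead itself.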

Author information

Authors and Affiliations

Massachusetts Institute of Technology, Cambridge, USA
Dimitri P. Bertsekas

Editor information

Editors and Affiliations

Institut für Wirtschaftstheorie und Operations Research, Universität Karlsruhe (TH), Kaiserstraße 12, 76131, Karlsruhe, Germany
Karl-Heinz Waldmann & Ulrike M. Stocker

Cite this paper

Bertsekas, D.P. (2007). Neuro-Dynamic Programming: An Overview and Recent Results. In: Waldmann, KH., Stocker, U.M. (eds) Operations Research Proceedings 2006. Operations Research Proceedings, vol 2006. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69995-8_11

DOI: https://doi.org/10.1007/978-3-540-69995-8_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69994-1
Online ISBN: 978-3-540-69995-8
eBook Packages: Business and Economics, Business and Management (R0)
