Mapping discounted and undiscounted Markov Decision Problems onto Hopfield neural networks

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 888)

Abstract

This paper presents a framework for mapping value iteration and related successive approximation methods for Markov Decision Problems onto Hopfield neural networks, for both the discounted and undiscounted versions with finite state and action spaces. We analyse the asymptotic behaviour of the control sets and give some estimates of the convergence rate of the value-iteration scheme. We relate the convergence properties to an energy function, which is the key element in mapping Markov Decision Problems onto Hopfield networks. Finally, we consider an application from queueing systems in communication networks and present the results of a computer simulation of a Hopfield network running the equivalent Markov Decision Problem, together with some comments on possible developments.
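
For readers unfamiliar with the value-iteration scheme named in the abstract, the following is a minimal sketch of the textbook discounted version for finite state and action spaces. It is not the paper's Hopfield mapping. The function name `value_iteration`, the array layouts of `P` and `r`, the discount factor `gamma`, and the stopping tolerance `tol` are all assumptions made for illustration.

```python
import numpy as np


def value_iteration(P, r, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Textbook value iteration for a finite discounted MDP (illustrative).

    Assumed layout (not taken from the paper):
      P[a, s, t] = probability of moving from state s to state t under action a
      r[s, a]    = one-step reward for taking action a in state s
    Returns the approximate optimal values, a greedy policy, and the
    number of sweeps performed.
    """
    n_states = P.shape[1]
    v = np.zeros(n_states)
    sweeps = 0
    for sweeps in range(1, max_iter + 1):
        # Bellman backup: q[s, a] = r[s, a] + gamma * sum_t P[a, s, t] * v[t]
        q = r + gamma * np.einsum("ast,t->sa", P, v)
        v_new = q.max(axis=1)
        # Stop once successive iterates agree in the sup norm.
        done = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if done:
            break
    # Greedy policy with respect to the final value estimate.
    q = r + gamma * np.einsum("ast,t->sa", P, v)
    return v, q.argmax(axis=1), sweeps


# Hypothetical two-state example: action 0 stays, action 1 switches state;
# only staying in state 1 pays a reward of 1.
P_demo = np.array([[[1.0, 0.0], [0.0, 1.0]],
                   [[0.0, 1.0], [1.0, 0.0]]])
r_demo = np.array([[0.0, 0.0],
                   [1.0, 0.0]])
v_demo, policy_demo, sweeps_demo = value_iteration(P_demo, r_demo, gamma=0.5)
```

In this toy problem the fixed point can be checked by hand: staying in state 1 forever is worth 1 / (1 - 0.5) = 2, and switching from state 0 is worth 0.5 * 2 = 1. Because the discounted backup is a contraction with modulus `gamma`, the sup-norm error shrinks by at least that factor per sweep; the undiscounted case discussed in the abstract does not have this property and needs a different stopping rule.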

Editor information

Stig I. Andersson

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Murgu, A. (1995). Mapping discounted and undiscounted Markov Decision Problems onto Hopfield neural networks. In: Andersson, S.I. (eds) Analysis of Dynamical and Cognitive Systems. Lecture Notes in Computer Science, vol 888. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58843-4_17

  • DOI: https://doi.org/10.1007/3-540-58843-4_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-58843-6

  • Online ISBN: 978-3-540-49113-2

  • eBook Packages: Springer Book Archive
