Abstract
This paper seeks to reduce the computation needed by iterative methods to find the expected discounted return in a finite semi-Markov or Markov chain. Two new norm-reducing extrapolations and a new iterative method are presented and shown to be convergent. Their application is illustrated on several 100-row problems. One of the extrapolations, the row sum extrapolation, appears to be promising.
Summary
This paper shows how the computational effort of iterative methods for determining the expected discounted return of a finite Markov or semi-Markov chain can be reduced. Two new norm-reducing extrapolations and a new iterative method are presented, and their convergence is shown. Results of several numerical test examples are reported.
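As a rough illustration of the kind of acceleration the abstract describes, the sketch below runs plain successive approximation for the expected discounted return, v = r + βPv, and adds a scalar extrapolation after each step. The function name `discounted_return` and the particular correction (the midpoint of MacQueen-style bounds on the remaining change) are illustrative assumptions, not the paper's row sum extrapolation, which is a more refined scheme.

```python
import numpy as np

def discounted_return(P, r, beta, tol=1e-10, max_iter=100_000):
    """Successive approximation for v = r + beta * P @ v, with a scalar
    extrapolation (illustrative sketch; assumes 0 < beta < 1 and P a
    stochastic matrix)."""
    v = np.zeros_like(r, dtype=float)
    for _ in range(max_iter):
        w = r + beta * (P @ v)            # one plain value-iteration step
        d = w - v                          # change produced by the step
        span = d.max() - d.min()
        # Extrapolate: add the discounted tail of the remaining change,
        # estimated by the midpoint of its smallest and largest components.
        v = w + beta / (1.0 - beta) * 0.5 * (d.min() + d.max())
        if span < tol * (1.0 - beta) / beta:
            break
    return v
```

The extrapolation exactly removes any constant component of the error, so convergence is governed by the contraction of the span seminorm rather than by the discount factor alone; this is the general mechanism by which norm-reducing extrapolations cut the iteration count.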
Porteus, E.L. Improved iterative computation of the expected discounted return in Markov and semi-Markov chains. Zeitschrift für Operations Research 24, 155–170 (1980). https://doi.org/10.1007/BF01919243