A further anticycling rule in multichain policy iteration for undiscounted Markov renewal programs

Spreen, D.

doi:10.1007/BF01917174

Published: July 1981

Volume 25, pages 225–233, (1981)

Zeitschrift für Operations Research

Abstract

As is well known, the convergence of the policy iteration algorithm in multichain Markov renewal programming without discounting depends on the choice of the relative value vectors during the iteration. We present a choice rule that also guarantees convergence and is weaker than other known rules. Moreover, the computational complexity of the policy iteration algorithm is smaller when this rule is used.
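
To see why the relative value vectors have to be chosen at all, it may help to recall the policy evaluation equations in their standard textbook form. The sketch below is not taken from the article itself; the symbols P, q, τ, g, v and u are assumed notation for a fixed stationary policy.

```latex
% Assumed notation (not from the article), for a fixed stationary policy:
%   P_{ij}  transition probabilities of the embedded Markov chain
%   q_i     expected one-step reward in state i
%   \tau_i  expected holding time in state i (\tau_i > 0)
%   g       gain vector, v  relative value vector
\begin{align}
  g_i &= \sum_j P_{ij}\, g_j , \\
  v_i &= q_i - \tau_i\, g_i + \sum_j P_{ij}\, v_j .
\end{align}
% The gain g is determined uniquely by this system, but v is not:
% if (g, v) is a solution, then so is (g, v + u) for every u with u = P u.
% The space of such u has dimension equal to the number of recurrent
% classes of P, so a multichain policy leaves several degrees of freedom.
```

In the unichain case the only freedom is an additive constant, which does not affect the improvement step. In the multichain case the freedom is larger, which is presumably where a selection rule for v of the kind the abstract describes comes into play.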

Zusammenfassung (English translation)

As is known, the convergence of the policy iteration algorithm for semi-Markov decision processes without discounting and with several ergodic sets depends on the choice of the relative values during the iteration. We give a selection rule that guarantees convergence and is weaker than other known rules. In addition, the computational effort of the policy iteration algorithm is lower when this rule is used than when the other rules are used.

Author information

Authors and Affiliations

Lehrstuhl für Informatik I, Rheinisch-Westfälische Technische Hochschule Aachen, Büchel 29–31, D-5100, Aachen
D. Spreen

About this article

Spreen, D. A further anticycling rule in multichain policy iteration for undiscounted Markov renewal programs. Zeitschrift für Operations Research 25, 225–233 (1981). https://doi.org/10.1007/BF01917174

Received: 15 January 1980
Revised: 15 April 1981
Issue Date: July 1981
DOI: https://doi.org/10.1007/BF01917174
