
The Relations Among Potentials, Perturbation Analysis, and Markov Decision Processes

Discrete Event Dynamic Systems

Abstract

This paper gives an introductory discussion of an important concept, the performance potentials of Markov processes, and of its relations to perturbation analysis (PA), average-cost Markov decision processes (MDPs), Poisson equations, α-potentials, the fundamental matrix, and the group inverse of the transition matrix (or of the infinitesimal generator). Applications to performance sensitivity estimation based on a single sample path and to performance optimization are also discussed. On-line algorithms for estimating performance sensitivities and on-line schemes for policy iteration are presented. The approach is closely related to reinforcement learning algorithms.
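To make the objects named in the abstract concrete, the following is a minimal sketch and not the paper's own algorithm. It assumes a finite ergodic discrete-time chain with transition matrix P, reward vector f, and the common normalization π g = η, under which the potentials g solve the Poisson equation (I − P) g + η e = f and can be obtained from the fundamental matrix as g = (I − P + eπ)⁻¹ f. The helper names `solve`, `stationary`, and `potentials`, and the two-state example, are hypothetical and chosen only for illustration; the paper's conventions may differ.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (exact with Fractions)."""
    n = len(A)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        # Find a row with a nonzero pivot in column c and swap it into place.
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][n] for i in range(n)]

def stationary(P):
    """Stationary distribution: pi P = pi with sum(pi) = 1."""
    n = len(P)
    # Rows of (P^T - I); the last equation is replaced by the normalization.
    A = [[P[j][i] - (1 if i == j else 0) for j in range(n)] for i in range(n)]
    A[-1] = [F(1)] * n
    b = [F(0)] * (n - 1) + [F(1)]
    return solve(A, b)

def potentials(P, f):
    """Return (pi, eta, g) where g solves (I - P + e pi) g = f (assumed form)."""
    n = len(P)
    pi = stationary(P)
    A = [[(1 if i == j else 0) - P[i][j] + pi[j] for j in range(n)]
         for i in range(n)]
    g = solve(A, f)
    eta = sum(p * x for p, x in zip(pi, f))
    return pi, eta, g

# Hypothetical two-state example (not taken from the paper).
P = [[F(1, 2), F(1, 2)], [F(1, 4), F(3, 4)]]
f = [F(3), F(0)]
pi, eta, g = potentials(P, f)

# A second transition matrix and the direction Q = P_new - P.
P_new = [[F(1, 4), F(3, 4)], [F(1, 2), F(1, 2)]]
Q = [[P_new[i][j] - P[i][j] for j in range(2)] for i in range(2)]
Qg = [sum(Q[i][j] * g[j] for j in range(2)) for i in range(2)]

# Directional derivative of the average reward along Q: pi Q g.
derivative = sum(pi[i] * Qg[i] for i in range(2))

# Average-reward difference formula: eta_new - eta = pi_new Q g (f unchanged).
pi_new, eta_new, _ = potentials(P_new, f)
difference = sum(pi_new[i] * Qg[i] for i in range(2))
```

The same vector g appears in both the derivative π Q g, which is the link to perturbation analysis, and the difference π′ Q g, which is the comparison that policy iteration relies on. Exact fractions are used here only so the identities can be checked without rounding error.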




Cite this article

Cao, XR. The Relations Among Potentials, Perturbation Analysis, and Markov Decision Processes. Discrete Event Dynamic Systems 8, 71–87 (1998). https://doi.org/10.1023/A:1008260528575


  • DOI: https://doi.org/10.1023/A:1008260528575
