Abstract
A weakness of classical Markov decision processes (MDPs) is that they scale very poorly due to the flat state-space representation. Factored MDPs address this representational problem by exploiting problem structure to specify the transition and reward functions of an MDP in a compact manner. However, in general, solutions to factored MDPs do not retain the structure and compactness of the problem representation, forcing approximate solutions, with approximate linear programming (ALP) emerging as a promising MDP-approximation technique. To date, most ALP work has focused on the primal-LP formulation, while the dual LP, which forms the basis for solving constrained Markov problems, has received much less attention. We show that a straightforward linear approximation of the dual optimization variables is problematic, because some of the required computations cannot be carried out efficiently. Nonetheless, we develop a composite approach that symmetrically approximates the primal and dual optimization variables (effectively approximating both the objective function and the feasible region of the LP), leading to a formulation that is computationally feasible and suitable for solving constrained MDPs. We empirically show that this new ALP formulation also performs well on unconstrained problems.
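To make the primal/dual LP pair concrete, the following sketch solves the standard *exact* linear programs for a tiny flat MDP (the formulations the paper's ALP methods approximate; see Puterman 1994). The toy MDP, its numbers, and the use of `scipy.optimize.linprog` are illustrative assumptions, not the paper's factored algorithms. The primal optimizes over value functions; the dual optimizes over occupation measures, which is why it is the natural basis for constrained problems.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative toy MDP (2 states, 2 actions); all numbers are made up.
gamma = 0.9
nS, nA = 2, 2
# P[a, s, s'] = transition probability; r[s, a] = reward
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.3, 0.7]]])
r = np.array([[1.0, 0.0],
              [0.0, 2.0]])
alpha = np.array([0.5, 0.5])  # state-relevance weights / initial distribution

# Primal LP: min alpha^T v  s.t.  v(s) >= r(s,a) + gamma * sum_s' P(s'|s,a) v(s')
# linprog wants A_ub x <= b_ub, so rewrite as (gamma*P - I) v <= -r.
A_ub, b_ub = [], []
for s in range(nS):
    for a in range(nA):
        A_ub.append(gamma * P[a, s] - np.eye(nS)[s])
        b_ub.append(-r[s, a])
primal = linprog(c=alpha, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                 bounds=[(None, None)] * nS)

# Dual LP over occupation measures x(s,a):
# max sum_{s,a} r(s,a) x(s,a)
# s.t. sum_a x(s',a) - gamma * sum_{s,a} P(s'|s,a) x(s,a) = alpha(s'),  x >= 0
A_eq = np.zeros((nS, nS * nA))
for sp in range(nS):
    for s in range(nS):
        for a in range(nA):
            A_eq[sp, s * nA + a] = (1.0 if s == sp else 0.0) - gamma * P[a, s, sp]
dual = linprog(c=-r.flatten(), A_eq=A_eq, b_eq=alpha,
               bounds=[(0, None)] * (nS * nA))

print("primal objective:", primal.fun)
print("dual objective:  ", -dual.fun)  # strong duality: the two coincide
```

In ALP, one shrinks the primal by restricting v to a span of basis functions, v = H w; the paper's symmetric approach additionally restricts the dual variables x to a compact basis, approximating both the objective and the feasible region at once. Constraints such as resource bounds enter the dual directly as extra linear constraints on x.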
Cite this article
Dolgov, D.A., Durfee, E.H. Symmetric approximate linear programming for factored MDPs with application to constrained problems. Ann Math Artif Intell 47, 273–293 (2006). https://doi.org/10.1007/s10472-006-9038-x
Keywords
- Markov decision processes
- approximate linear programming
- primal-LP formulation
- dual LP
- constrained Markov problems