
Symmetric approximate linear programming for factored MDPs with application to constrained problems

Published in: Annals of Mathematics and Artificial Intelligence

Abstract

A weakness of classical Markov decision processes (MDPs) is that they scale very poorly due to the flat state-space representation. Factored MDPs address this representational problem by exploiting problem structure to specify the transition and reward functions of an MDP in a compact manner. However, in general, solutions to factored MDPs do not retain the structure and compactness of the problem representation, forcing approximate solutions, with approximate linear programming (ALP) emerging as a promising MDP-approximation technique. To date, most ALP work has focused on the primal-LP formulation, while the dual LP, which forms the basis for solving constrained Markov problems, has received much less attention. We show that a straightforward linear approximation of the dual optimization variables is problematic, because some of the required computations cannot be carried out efficiently. Nonetheless, we develop a composite approach that symmetrically approximates the primal and dual optimization variables (effectively approximating both the objective function and the feasible region of the LP), leading to a formulation that is computationally feasible and suitable for solving constrained MDPs. We empirically show that this new ALP formulation also performs well on unconstrained problems.
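For context, the standard LP machinery the abstract refers to can be sketched as follows. This is the textbook formulation for discounted MDPs, not a reproduction of the paper's own equations; the symbols (state-relevance weights \(\alpha\), basis functions \(\phi_i\), occupation measure \(x\), cost function \(c\) with bound \(\hat{c}\)) are standard notation assumed here for illustration.

```latex
% Exact primal LP for a discounted MDP
% (one variable per state, one constraint per state-action pair):
\min_{v}\;\sum_{s}\alpha(s)\,v(s)
\quad\text{s.t.}\quad
v(s)\;\ge\;R(s,a)+\gamma\sum_{s'}P(s'\mid s,a)\,v(s'),
\qquad\forall\,s,a.

% Primal ALP restricts v to the span of k basis functions,
% shrinking the variables from |S| to k:
v(s)\;\approx\;\sum_{i=1}^{k}w_{i}\,\phi_{i}(s).

% Dual LP over occupation measures x(s,a); constrained MDPs are
% naturally expressed here, since costs are linear in x:
\max_{x\ge 0}\;\sum_{s,a}x(s,a)\,R(s,a)
\quad\text{s.t.}\quad
\sum_{a}x(s',a)-\gamma\sum_{s,a}P(s'\mid s,a)\,x(s,a)=\alpha(s'),
\;\;\forall\,s',
\qquad
\sum_{s,a}x(s,a)\,c(s,a)\le\hat{c}.
```

Restricting \(v\) alone compacts the primal objective but leaves the constraint set exponentially large, while a naive linear restriction of \(x\) alone runs into the computational difficulty the abstract identifies, which is what motivates approximating both sets of variables symmetrically.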



Author information

Corresponding author

Correspondence to Dmitri A. Dolgov.


About this article

Cite this article

Dolgov, D.A., Durfee, E.H. Symmetric approximate linear programming for factored MDPs with application to constrained problems. Ann Math Artif Intell 47, 273–293 (2006). https://doi.org/10.1007/s10472-006-9038-x

