DOI: 10.1145/1160633.1160684
Article

Resource allocation among agents with preferences induced by factored MDPs

Published: 08 May 2006

Abstract

Distributing scarce resources among agents in a way that maximizes the social welfare of the group is a computationally hard problem when the value of a resource bundle is not linearly decomposable. Furthermore, the problem of determining the value of a resource bundle can be a significant computational challenge in itself, such as for an agent operating in a stochastic environment, where the value of a resource bundle is the expected payoff of the optimal policy realizable given those resources. Recent work has shown that the structure in agents' preferences induced by stochastic policy-optimization problems (modeled as MDPs) can be exploited to solve the resource-allocation and the policy-optimization problems simultaneously, leading to drastic (often exponential) improvements in computational efficiency. However, previous work used a flat MDP model that scales very poorly. In this work, we present and empirically evaluate a resource-allocation mechanism that achieves much better scaling by using factored MDP models, thus exploiting both the structure in agents' MDP-induced preferences and the structure within the agents' MDPs themselves.
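To make the problem setting concrete, the sketch below is a hypothetical toy example, not the mechanism proposed in the paper: it treats each agent's planning problem as a small flat MDP (exactly the representation whose poor scaling the paper addresses), computes the value of a resource bundle as the expected payoff of the best policy executable with that bundle, and then maximizes social welfare by brute-force enumeration of allocations. All state, action, and resource names below are invented for illustration.

```python
# Minimal sketch, assuming a toy flat MDP per agent (not the paper's
# factored-MDP mechanism). An agent's value for a resource bundle is the
# expected discounted payoff of the best policy it can execute when actions
# whose required resource is missing are unavailable.

import itertools

GAMMA = 0.95  # discount factor

def bundle_value(mdp, bundle, iters=300):
    """Value iteration restricted to actions whose required resource
    (if any) is present in `bundle`; returns the start-state value."""
    states, actions, needs, P, R, start = mdp
    usable = [a for a in actions if needs[a] is None or needs[a] in bundle]
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        V = {s: max(R[s][a] + GAMMA * sum(p * V[s2] for s2, p in P[s][a])
                    for a in usable)
             for s in states}
    return V[start]

# Hypothetical MDP: 'walk' needs no resource, 'drive' needs a car,
# 'fly' needs a plane; costlier actions reach the goal more reliably.
states = ['home', 'goal']
actions = ['walk', 'drive', 'fly']
needs = {'walk': None, 'drive': 'car', 'fly': 'plane'}
P = {'home': {'walk':  [('goal', 0.2), ('home', 0.8)],
              'drive': [('goal', 0.7), ('home', 0.3)],
              'fly':   [('goal', 0.95), ('home', 0.05)]},
     'goal': {a: [('goal', 1.0)] for a in actions}}
R = {'home': {'walk': 0.0, 'drive': -1.0, 'fly': -3.0},
     'goal': {a: 1.0 for a in actions}}
mdp = (states, actions, needs, P, R, 'home')

# Two identical agents compete for one car and one plane: enumerate every
# way to split the resources and keep the split with the largest total value.
resources = ('car', 'plane')
splits = [(frozenset(b), frozenset(r for r in resources if r not in b))
          for k in range(len(resources) + 1)
          for b in itertools.combinations(resources, k)]
best = max(splits, key=lambda s: sum(bundle_value(mdp, part) for part in s))
print('welfare-maximizing split:', [sorted(part) for part in best])
```

Even in this tiny example, every candidate allocation requires solving an MDP per agent, and the number of allocations grows exponentially with the number of resources; the paper's contribution is a mechanism that avoids this enumeration by exploiting the structure of factored MDPs.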




Published In

AAMAS '06: Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
May 2006
1631 pages
ISBN: 1595933034
DOI: 10.1145/1160633


Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 May 2006


Author Tags

  1. (multi-)agent planning
  2. task and resource allocation in agent systems

Qualifiers

  • Article

Conference

AAMAS '06

Acceptance Rates

Overall acceptance rate: 1,155 of 5,036 submissions (23%)



Cited By

  • (2017) Online learning for Markov decision processes applied to multi-agent systems. 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 1596-1601. DOI: 10.1109/CDC.2017.8263879
  • (2016) Convex synthesis of randomized policies for controlled Markov chains with density safety upper bound constraints. 2016 American Control Conference (ACC), pp. 6290-6295. DOI: 10.1109/ACC.2016.7526658
  • (2016) Convex synthesis of optimal policies for Markov Decision Processes with sequentially-observed transitions. 2016 American Control Conference (ACC), pp. 3862-3867. DOI: 10.1109/ACC.2016.7525515
  • (2009) Token Based Resource Sharing in Heterogeneous Multi-agent Teams. Proceedings of the 12th International Conference on Principles of Practice in Multi-Agent Systems, pp. 113-126. DOI: 10.1007/978-3-642-11161-7_8
  • (2008) Planning for Coordination and Coordination for Planning. Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01, pp. 1-3. DOI: 10.1109/WIIAT.2008.389
  • (2007) Combinatorial resource scheduling for multiagent MDPs. Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 1-8. DOI: 10.1145/1329125.1329369
  • (2006) Resource allocation among agents with MDP-induced preferences. Journal of Artificial Intelligence Research, 27(1):505-549. DOI: 10.5555/1622572.1622587
