DOI: 10.1145/1160633.1160684
Article

Resource allocation among agents with preferences induced by factored MDPs

Published: 08 May 2006

Abstract

Distributing scarce resources among agents in a way that maximizes the social welfare of the group is a computationally hard problem when the value of a resource bundle is not linearly decomposable. Furthermore, the problem of determining the value of a resource bundle can be a significant computational challenge in itself, such as for an agent operating in a stochastic environment, where the value of a resource bundle is the expected payoff of the optimal policy realizable given those resources. Recent work has shown that the structure in agents' preferences induced by stochastic policy-optimization problems (modeled as MDPs) can be exploited to solve the resource-allocation and the policy-optimization problems simultaneously, leading to drastic (often exponential) improvements in computational efficiency. However, previous work used a flat MDP model that scales very poorly. In this work, we present and empirically evaluate a resource-allocation mechanism that achieves much better scaling by using factored MDP models, thus exploiting both the structure in agents' MDP-induced preferences and the structure within the agents' MDPs themselves.
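To make the problem setting concrete, the sketch below is a hypothetical toy example, not the mechanism proposed in the paper: it treats each agent's planning problem as a small flat MDP (exactly the representation whose poor scaling the paper addresses), computes the value of a resource bundle as the expected payoff of the best policy executable with that bundle, and then maximizes social welfare by brute-force enumeration of allocations. All state, action, and resource names below are invented for illustration.

```python
# Minimal sketch, assuming a toy flat MDP per agent (not the paper's
# factored-MDP mechanism). An agent's value for a resource bundle is the
# expected discounted payoff of the best policy it can execute when actions
# whose required resource is missing are unavailable.

import itertools

GAMMA = 0.95  # discount factor

def bundle_value(mdp, bundle, iters=300):
    """Value iteration restricted to actions whose required resource
    (if any) is present in `bundle`; returns the start-state value."""
    states, actions, needs, P, R, start = mdp
    usable = [a for a in actions if needs[a] is None or needs[a] in bundle]
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        V = {s: max(R[s][a] + GAMMA * sum(p * V[s2] for s2, p in P[s][a])
                    for a in usable)
             for s in states}
    return V[start]

# Hypothetical MDP: 'walk' needs no resource, 'drive' needs a car,
# 'fly' needs a plane; costlier actions reach the goal more reliably.
states = ['home', 'goal']
actions = ['walk', 'drive', 'fly']
needs = {'walk': None, 'drive': 'car', 'fly': 'plane'}
P = {'home': {'walk':  [('goal', 0.2), ('home', 0.8)],
              'drive': [('goal', 0.7), ('home', 0.3)],
              'fly':   [('goal', 0.95), ('home', 0.05)]},
     'goal': {a: [('goal', 1.0)] for a in actions}}
R = {'home': {'walk': 0.0, 'drive': -1.0, 'fly': -3.0},
     'goal': {a: 1.0 for a in actions}}
mdp = (states, actions, needs, P, R, 'home')

# Two identical agents compete for one car and one plane: enumerate every
# way to split the resources and keep the split with the largest total value.
resources = ('car', 'plane')
splits = [(frozenset(b), frozenset(r for r in resources if r not in b))
          for k in range(len(resources) + 1)
          for b in itertools.combinations(resources, k)]
best = max(splits, key=lambda s: sum(bundle_value(mdp, part) for part in s))
print('welfare-maximizing split:', [sorted(part) for part in best])
```

Even in this tiny example, every candidate allocation requires solving an MDP per agent, and the number of allocations grows exponentially with the number of resources; the paper's contribution is a mechanism that avoids this enumeration by exploiting the structure of factored MDPs.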




Published In

AAMAS '06: Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
May 2006
1631 pages
ISBN: 1595933034
DOI: 10.1145/1160633


Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 May 2006


Author Tags

  1. (multi-)agent planning
  2. task and resource allocation in agent systems

Qualifiers

  • Article

Conference

AAMAS '06

Acceptance Rates

Overall acceptance rate: 1,155 of 5,036 submissions (23%)



Cited By

  • (2017) Online learning for Markov decision processes applied to multi-agent systems. 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 1596-1601. DOI: 10.1109/CDC.2017.8263879
  • (2016) Convex synthesis of randomized policies for controlled Markov chains with density safety upper bound constraints. 2016 American Control Conference (ACC), pp. 6290-6295. DOI: 10.1109/ACC.2016.7526658
  • (2016) Convex synthesis of optimal policies for Markov Decision Processes with sequentially-observed transitions. 2016 American Control Conference (ACC), pp. 3862-3867. DOI: 10.1109/ACC.2016.7525515
  • (2009) Token Based Resource Sharing in Heterogeneous Multi-agent Teams. Proceedings of the 12th International Conference on Principles of Practice in Multi-Agent Systems, pp. 113-126. DOI: 10.1007/978-3-642-11161-7_8
  • (2008) Planning for Coordination and Coordination for Planning. Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01, pp. 1-3. DOI: 10.1109/WIIAT.2008.389
  • (2007) Combinatorial resource scheduling for multiagent MDPs. Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 1-8. DOI: 10.1145/1329125.1329369
  • (2006) Resource allocation among agents with MDP-induced preferences. Journal of Artificial Intelligence Research, 27(1):505-549. DOI: 10.5555/1622572.1622587
