ABSTRACT
Planning for cooperative teams under uncertainty is a crucial problem in multiagent systems. Decentralized partially observable Markov decision processes (DEC-POMDPs) provide a convenient but intractable model for specifying planning problems in cooperative teams. Compared to the single-agent case, an additional challenge is posed by the lack of free communication between teammates. We argue that acting close to optimally in a team involves a tradeoff between opportunistically taking advantage of an agent's local observations and being predictable to its teammates. We present a more opportunistic version of an existing approximate algorithm for DEC-POMDPs and investigate this tradeoff. Preliminary evaluation shows that in certain settings the opportunistic modification provides significantly better performance.
Index Terms
- Subjective approximate solutions for decentralized POMDPs