Multiagent Planning with Trembling-Hand Perfect Equilibrium in Multiagent POMDPs

Yabu, Yuichi; Yokoo, Makoto; Iwasaki, Atsushi

doi:10.1007/978-3-642-01639-4_2

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5044)

Abstract

Multiagent Partially Observable Markov Decision Processes (Multiagent POMDPs) are a popular model of multiagent systems with uncertainty. Since the computational cost of finding an optimal joint policy is prohibitive, Joint Equilibrium-based Search for Policies with Nash Equilibrium (JESP-NE) has been proposed. It finds a locally optimal joint policy in which each policy is a best response to the other policies; i.e., the joint policy is a Nash equilibrium.
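The idea of iterating best responses until no agent can improve can be illustrated on a much simpler problem than a Multiagent POMDP. The following is a minimal sketch, assuming a two-agent common-payoff matrix game as a stand-in; the names `PAYOFF` and `alternating_best_response` are hypothetical, and this is not the JESP-NE algorithm itself.

```python
from typing import List, Tuple

# Hypothetical common payoff: both agents receive PAYOFF[a][b]
# when agent 1 plays action a and agent 2 plays action b.
PAYOFF: List[List[float]] = [
    [5.0, 0.0],
    [0.0, 10.0],
]


def alternating_best_response(
    payoff: List[List[float]], start: Tuple[int, int]
) -> Tuple[int, int]:
    """Alternate best responses until neither agent changes its action.

    The returned pair is a pure Nash equilibrium of the matrix game:
    each action is a best response to the other agent's action.
    """
    a, b = start
    while True:
        # Agent 1 best-responds while agent 2's action b is held fixed.
        new_a = max(range(len(payoff)), key=lambda i: payoff[i][b])
        if payoff[new_a][b] <= payoff[a][b]:
            new_a = a  # switch only on a strict improvement
        # Agent 2 best-responds while agent 1's new action is held fixed.
        new_b = max(range(len(payoff[0])), key=lambda j: payoff[new_a][j])
        if payoff[new_a][new_b] <= payoff[new_a][b]:
            new_b = b  # switch only on a strict improvement
        if (new_a, new_b) == (a, b):
            return a, b
        a, b = new_a, new_b


if __name__ == "__main__":
    for start in [(0, 0), (0, 1), (1, 1)]:
        a, b = alternating_best_response(PAYOFF, start)
        print(start, "->", (a, b), "payoff", PAYOFF[a][b])
```

In this toy game, starting from (0, 0) the search stays at (0, 0) with payoff 5 even though (1, 1) yields 10. Both are Nash equilibria, which shows why an equilibrium-based search is only locally optimal and why its result depends on where it starts.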

One limitation of JESP-NE is that the quality of the obtained joint policy depends on a predefined default policy. More specifically, when computing a best response, JESP-NE falls back on this default policy for any observation that has zero probability. If the default policy is poor, JESP-NE tends to converge to a sub-optimal joint policy.

In this paper, we propose a method, JESP-TPE, that finds a locally optimal joint policy based on a concept called Trembling-hand Perfect Equilibrium (TPE). In finding a TPE, we assume that an agent might make a mistake in selecting its action with small probability. Thus, an observation that has zero probability in JESP-NE has non-zero probability, and the default policy is no longer needed. As a result, JESP-TPE can converge to a better joint policy than JESP-NE, which we confirm by experimental evaluations.
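The effect of a small action mistake on observation probabilities can be seen in a toy calculation. The sketch below assumes a uniform "slip" over the non-intended actions and a hypothetical observation table `OBS_GIVEN_ACTION`; the abstract does not specify the tremble model used in the paper, so the function names and numbers here are illustrative assumptions only.

```python
from typing import List

# Hypothetical table: OBS_GIVEN_ACTION[a][o] is the probability that the
# other agent receives observation o when this agent executes action a.
OBS_GIVEN_ACTION: List[List[float]] = [
    [1.0, 0.0],
    [0.0, 1.0],
]


def tremble(intended: int, n_actions: int, eps: float) -> List[float]:
    """Distribution over executed actions when the agent slips with total
    probability eps, spread uniformly over the non-intended actions
    (an assumed tremble model)."""
    if n_actions < 2:
        return [1.0]
    slip = eps / (n_actions - 1)
    return [1.0 - eps if a == intended else slip for a in range(n_actions)]


def observation_probs(
    action_dist: List[float], obs_given_action: List[List[float]]
) -> List[float]:
    """Marginal probability of each observation under an action distribution."""
    n_obs = len(obs_given_action[0])
    return [
        sum(p * obs_given_action[a][o] for a, p in enumerate(action_dist))
        for o in range(n_obs)
    ]


if __name__ == "__main__":
    # Without trembling, observation 1 can never occur.
    print(observation_probs(tremble(0, 2, 0.0), OBS_GIVEN_ACTION))
    # With a small tremble, every observation has positive probability.
    print(observation_probs(tremble(0, 2, 0.01), OBS_GIVEN_ACTION))
```

Once every observation has positive probability, a best response is well defined for each of them, so no fallback policy is required for unreachable observations.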

Author information

Authors and Affiliations

Graduate School of ISEE, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka, 819-0395, Japan
Yuichi Yabu, Makoto Yokoo & Atsushi Iwasaki

Editor information

Editors and Affiliations

School of Computer Science and Software Engineering, University of Wollongong, NSW 2522, Wollongong, Australia
Aditya Ghose
National ICT Australia, Queensland Research Laboratory, St Lucia, Queensland, Australia
Guido Governatori
Rangsit University, Bangkok, Thailand
Ramakoti Sadananda

About this paper

Cite this paper

Yabu, Y., Yokoo, M., Iwasaki, A. (2009). Multiagent Planning with Trembling-Hand Perfect Equilibrium in Multiagent POMDPs. In: Ghose, A., Governatori, G., Sadananda, R. (eds) Agent Computing and Multi-Agent Systems. PRIMA 2007. Lecture Notes in Computer Science, vol 5044. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01639-4_2

DOI: https://doi.org/10.1007/978-3-642-01639-4_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01638-7
Online ISBN: 978-3-642-01639-4
eBook Packages: Computer Science, Computer Science (R0)