Abstract
In this paper we present a model of a two-player partially observable "state game" and study its optimality. The model is inspired by the practical problem of negotiation in a multi-agent system and formulates, from a game-theoretic point of view, the so-called contract net protocol. It covers a wide variety of real problems, including simple card games such as blackjack and many negotiation and bargaining situations. The results that follow are valid for non-zero-sum games as well as for zero-sum games. Specifically, we establish and prove an equivalence between partially observable state games and classical (single-state) bi-matrix games. If the original state game is zero-sum, then the equivalent bi-matrix game is zero-sum as well.
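To make the stated equivalence concrete, here is a minimal sketch of one standard way such a reduction can be carried out; it is an illustration under assumptions, not the paper's own construction. The toy instance (the prior p, the observation maps obs1 and obs2, and the payoff tensors r1 and r2) is entirely hypothetical: each player's pure strategy in the induced game is a map from private observations to actions, and the bi-matrix entries are the state game's payoffs averaged over the prior.

```python
import itertools
import numpy as np

# Hypothetical two-player partially observable state game:
# nature draws state s with probability p[s]; player i privately
# sees signal obs_i[s]; both then act simultaneously and receive
# payoffs r1[s][a1][a2] and r2[s][a1][a2].
p = np.array([0.5, 0.5])         # assumed prior over 2 states
obs1 = [0, 0]                    # player 1 cannot distinguish the states
obs2 = [0, 1]                    # player 2 observes the state exactly
actions = [0, 1]                 # two actions per player
rng = np.random.default_rng(0)
r1 = rng.integers(-2, 3, size=(2, 2, 2)).astype(float)  # placeholder payoffs
r2 = -r1                         # zero-sum instance, for illustration

def pure_policies(signals):
    """All maps from possible private signals to actions (pure strategies)."""
    sigs = sorted(set(signals))
    for choice in itertools.product(actions, repeat=len(sigs)):
        yield dict(zip(sigs, choice))

P1 = list(pure_policies(obs1))
P2 = list(pure_policies(obs2))

# Equivalent bi-matrix game: one row per pure policy of player 1,
# one column per pure policy of player 2; each entry is the expected
# payoff under the state distribution.
A = np.zeros((len(P1), len(P2)))
B = np.zeros((len(P1), len(P2)))
for i, f in enumerate(P1):
    for j, g in enumerate(P2):
        for s in range(len(p)):
            a1, a2 = f[obs1[s]], g[obs2[s]]
            A[i, j] += p[s] * r1[s][a1][a2]
            B[i, j] += p[s] * r2[s][a1][a2]

# Zero-sum preservation: since r2 = -r1 state by state, the induced
# bi-matrix game satisfies B = -A entrywise.
assert np.allclose(B, -A)
```

Because the induced game is an ordinary bi-matrix game, its equilibria can be found with standard tools: linear programming in the zero-sum case, or quadratic-programming formulations in the general-sum case.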
Cite this article
Golfarelli, M., Meuleau, N. A Model of Partially Observable State Game and its Optimality. Applied Intelligence 14, 273–284 (2001). https://doi.org/10.1023/A:1011294719852