Abstract
We consider a group of several non-Bayesian agents that can fully coordinate their activities and share their past experience in order to achieve a joint goal in the face of uncertainty. The reward obtained by each agent is a function of the environment state but not of the actions taken by the other agents in the group. The environment state (controlled by Nature) may change arbitrarily, and the reward function is initially unknown. Two basic feedback structures are considered. In one of them — the perfect monitoring case — the agents are able to observe the previous environment state as part of their feedback, while in the other — the imperfect monitoring case — all that is available to the agents are the rewards obtained. Both of these settings refer to partially observable processes, where the current environment state is unknown. Our study adopts the competitive ratio criterion. It is shown that, for the imperfect monitoring case, there exists an efficient stochastic policy that ensures that the competitive ratio is obtained for all agents at almost all stages with arbitrarily high probability, where efficiency is measured in terms of rate of convergence. It is also shown that if the agents are restricted to deterministic policies then no such policy exists, even in the perfect monitoring case.
Monderer, D., Tennenholtz, M. Dynamic non-Bayesian decision making in multi-agent systems. Annals of Mathematics and Artificial Intelligence 25, 91–106 (1999). https://doi.org/10.1023/A:1018917719749