Dynamic non-Bayesian decision making in multi-agent systems

Published in: Annals of Mathematics and Artificial Intelligence

Abstract

We consider a group of non-Bayesian agents that can fully coordinate their activities and share their past experience in order to achieve a joint goal in the face of uncertainty. The reward obtained by each agent is a function of the environment state, but not of the actions taken by the other agents in the group. The environment state (controlled by Nature) may change arbitrarily, and the reward function is initially unknown. Two basic feedback structures are considered. In the first, the perfect monitoring case, the agents observe the previous environment state as part of their feedback; in the second, the imperfect monitoring case, all that is available to the agents are the rewards obtained. Both settings refer to partially observable processes, in which the current environment state is unknown. Our study adopts the competitive ratio criterion. It is shown that, in the imperfect monitoring case, there exists an efficient stochastic policy ensuring that the competitive ratio is obtained for all agents at almost all stages with arbitrarily high probability, where efficiency is measured in terms of rate of convergence. It is also shown that if the agents are restricted to deterministic policies, then no such policy exists, even in the perfect monitoring case.
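To make the competitive ratio criterion concrete, here is a minimal illustrative sketch (not the authors' algorithm): the ratio compares the reward a policy actually accumulates against the reward that the best choices in hindsight would have accumulated. The state names, actions, and reward table below are made up for the demonstration, and the uniformly random policy stands in for a generic stochastic policy under imperfect monitoring.

```python
import random

def competitive_ratio(obtained, best):
    """Ratio of the reward actually accumulated to the reward that the
    best choices in hindsight would have accumulated."""
    return sum(obtained) / sum(best)

# Hypothetical reward table reward[state][action]; in the model it is
# initially unknown to the agents.
reward = {"s1": {"a": 1.0, "b": 0.2},
          "s2": {"a": 0.3, "b": 0.9}}

def run(policy, states, rng):
    """Play one stage per environment state.  Under imperfect monitoring
    the agent observes only the reward it obtains, never the state."""
    obtained, best = [], []
    for s in states:
        act = policy(rng)
        obtained.append(reward[s][act])
        best.append(max(reward[s].values()))  # hindsight benchmark
    return competitive_ratio(obtained, best)

def uniform(rng):
    """A simple stochastic policy: pick an action uniformly at random."""
    return rng.choice(["a", "b"])

# Nature may choose the state sequence arbitrarily; here it is random
# only for the sake of the demo.
nature = random.Random(0)
states = [nature.choice(["s1", "s2"]) for _ in range(1000)]
ratio = run(uniform, states, random.Random(1))
print(f"realized competitive ratio: {ratio:.3f}")
```

Note that this toy policy makes no attempt at the paper's guarantee, which concerns efficient stochastic policies that secure the competitive ratio at almost all stages with arbitrarily high probability.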




Cite this article

Monderer, D., Tennenholtz, M. Dynamic non-Bayesian decision making in multi-agent systems. Annals of Mathematics and Artificial Intelligence 25, 91–106 (1999). https://doi.org/10.1023/A:1018917719749
