ABSTRACT
We have developed a new series of multi-agent reinforcement learning algorithms that choose a policy based on beliefs about co-players' policies. The algorithms apply to settings where the state is fully observable by all agents, with no limit on the number of players. Some of the algorithms employ embedded beliefs to handle cases where co-players also choose their policies based on beliefs about others' policies. Simulation experiments on Iterated Prisoner's Dilemma games show that the algorithms using policy-based beliefs converge to highly mutually cooperative behavior, unlike existing algorithms based on action-based beliefs.
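The policy-based belief idea can be illustrated with a minimal sketch (the candidate policies, payoff values, and function names below are our own illustrative assumptions, not the paper's algorithm): the agent keeps a probability distribution over hypothesized co-player policies, updates it by Bayes' rule from observed actions, and best-responds to the expected co-player behavior.

```python
# Hypothetical candidate policies for the co-player: each maps the
# agent's previous action ("C" or "D") to P(co-player cooperates).
CANDIDATES = {
    "tit_for_tat":   lambda my_prev: 0.95 if my_prev == "C" else 0.05,
    "always_coop":   lambda my_prev: 0.95,
    "always_defect": lambda my_prev: 0.05,
}

# Standard Prisoner's Dilemma payoffs for (my action, co-player action).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def bayes_update(belief, my_prev, observed_action):
    """One Bayesian update of the policy-based belief from an observed action."""
    posterior = {}
    for name, p_coop_fn in CANDIDATES.items():
        p_c = p_coop_fn(my_prev)
        likelihood = p_c if observed_action == "C" else 1.0 - p_c
        posterior[name] = belief[name] * likelihood
    z = sum(posterior.values())
    return {name: w / z for name, w in posterior.items()}

def best_response(belief, my_prev):
    """Action maximizing expected one-shot payoff under the current belief."""
    def expected(my_action):
        total = 0.0
        for name, p_coop_fn in CANDIDATES.items():
            p_c = p_coop_fn(my_prev)
            total += belief[name] * (p_c * PAYOFF[(my_action, "C")]
                                     + (1 - p_c) * PAYOFF[(my_action, "D")])
        return total
    return max(["C", "D"], key=expected)
```

A greedy one-shot best response like this still tends toward defection; the point of the paper's policy-based (and embedded) beliefs is precisely to evaluate policies, not single actions, so that mutual cooperation can emerge.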