Abstract:
The theory of games involving players who adaptively learn from their past experiences is not yet well understood. We analyze games in which each player makes, on every turn, a probabilistic choice of action determined by a kth-order Markov process, which models how the player learns from its past k actions for a fixed number k. Because the number of states in such a Markov process grows exponentially with k, the analysis of games involving learners with long memories has been viewed as computationally intractable. This study develops a technique that makes the analysis of these long-memory Markov processes feasible. We further show that, for two players engaged in an iterated prisoners' dilemma, the probability of mutual defection increases with the size of their memories. This result is consistent with the classical prisoners' dilemma with two rational players.
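
The exponential growth in the number of states can be illustrated with a minimal sketch. It is not the paper's technique, and it rests on assumptions the abstract does not spell out: each of the two players has two actions (cooperate "C" or defect "D"), and a state is the joint record of both players' actions over the last k rounds. The names `ACTIONS`, `joint_histories`, and `state_count` are hypothetical and introduced only for illustration.

```python
from itertools import product

# Assumption: two actions per player, cooperate ("C") and defect ("D").
ACTIONS = ("C", "D")


def joint_histories(k):
    """Enumerate every joint history of the last k rounds for two players.

    Assumption: a state is a tuple of k rounds, each round being a pair
    (action of player 1, action of player 2).
    """
    rounds = list(product(ACTIONS, repeat=2))  # 4 possible outcomes per round
    return list(product(rounds, repeat=k))


def state_count(k):
    """Number of states under the assumption above: (2 * 2) ** k = 4 ** k."""
    return (len(ACTIONS) ** 2) ** k


if __name__ == "__main__":
    for k in range(1, 9):
        print(f"k = {k}: {state_count(k)} states")
```

Under these assumptions, each additional round of memory multiplies the number of states by four, which is why naive enumeration quickly becomes impractical for long memories.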
Date of Conference: 03-06 December 2014
Date Added to IEEE Xplore: 19 February 2015
Electronic ISBN: 978-1-4799-5955-6