Abstract
This paper studies how an improvement in the information structure affects the perfect public equilibrium payoff set in discounted stochastic games with imperfect public monitoring. We first propose three partial orders on information structures in stochastic games. Although each of them reduces to the notion of garbling in repeated games (Kandori in Rev Econ Stud 59:581–593, 1992), we find that, for some games, an improvement of information in terms of the first two garbling notions does not imply an expansion of the equilibrium payoff set. We also show that greater informativeness in terms of our third notion of garbling is sufficient for the expansion, thereby extending the well-known monotonicity result in Kandori (1992) to stochastic games.

Notes
Kandori (1992) originally calls it quasi-garbling.
Kandori’s monotonicity result is originally about the pure strategy sequential equilibrium payoff set; our result is with respect to the perfect public equilibrium payoff set. He assumes full support of public monitoring signals, which implies that the set of pure strategy sequential equilibrium payoffs is identical to that of pure PPE payoffs; whereas we do not assume full support. We allow mixed public strategies.
See, for example, Hirshleifer (1971). Pęski (2008) [see also Gossner and Mertens (2001)] considers a smaller class of games, i.e., zero-sum games, and thereby obtains an analogous result to Blackwell (1951). Gossner (2000) provides a notion of informativeness for comparing the set of correlated equilibrium outcomes (and Bayes Nash equilibrium in games with incomplete information). See also Lehrer et al. (2013) for further results in this line.
For this reason, it is immaterial in our analysis which of the two is first observed by players.
For a set B, \(\Delta (B)\) denotes the set of probability distributions on B.
We may interpret player i’s (ex-ante) utility function as it is derived from an ex-post utility function \(u_i : S \times A_i \times Y \rightarrow \mathbb {R}\):
$$\begin{aligned}g_i (s, a) \equiv \sum _{ ( t, y ) \in S \times Y} p(t,y|s,a) u_i (s,a_i, y)\end{aligned}$$
for each \((s,a) \in S \times A\). Note that observing a realization of the ex-post payoff gives players no further information about \(a_{-i}\), given that they observe s and y.
This assumption is trivially satisfied when players’ ex-post utility function does not depend on a public signal. In this case, the signal is employed purely for monitoring.
\(\mathbb {I}_A\) is the indicator function, i.e. \(\mathbb {I}_A= 1\) if A is true, and 0 otherwise.
Given \(\Pi = (Y,f)\), for \(\xi : S \times Y \rightarrow \mathbb {R}^N\), let \(\mathbb {E}^\Pi [ \xi (t,y) | s, \alpha ] \equiv \sum _{ t \in S, y \in Y, a \in A} \alpha (a) f (y|t,s,a) q( t | s,a) \xi (t,y)\) for every mixed action profile \(\alpha \) and state s.
Since there is always a chance of transition to \(s_2\), if g is too large there is no \(\delta \) that satisfies the inequality.
Given a set A, co(A) denotes the convex hull of A.
Proposition 2 is proved without a public randomization device, but it is easy to extend to allow for public randomization.
Yamamoto (2016) studies this environment with perfect monitoring and provides a folk theorem. When monitoring is perfect, a recursive characterization is still available as we could regard the public belief for the state as a state.
References
Blackwell D (1951) Comparison of experiments. Proc Second Berkeley Symp Math Stat Probab 1:93–102
Abreu D, Pearce D, Stacchetti E (1990) Toward a theory of discounted repeated games with imperfect monitoring. Econometrica 58(5):1041–1063
Fudenberg D, Levine D, Maskin E (1994) The folk theorem with imperfect public information. Econometrica 62(5):997–1039
Fudenberg D, Yamamoto Y (2011) The folk theorem for irreducible stochastic games with imperfect public monitoring. J Econ Theory 146:1664–1683
Gossner O (2000) Comparison of information structures. Games Econ Behav 30:44–63
Hirshleifer J (1971) The private and social value of information and the reward to inventive activity. Am Econ Rev 61:561–574
Hörner J, Sugaya T, Takahashi S, Vieille N (2011) Recursive methods in discounted stochastic games: an algorithm for \(\delta \rightarrow 1\) and a folk theorem. Econometrica 79:1277–1318
Kamada Y, Kominers SD (2010) Information can wreck cooperation: a counterpoint to Kandori (1992). Econ Lett 107:112–114
Kandori M (1992) The use of information in repeated games with imperfect monitoring. Rev Econ Stud 59:581–593
Kandori M, Obara I (2006b) Less is more: an observability paradox in repeated games. Int J Game Theory 34:475–493
Kloosterman A (2015) Public information in Markov games. J Econ Theory 157:28–48
Lehrer E, Rosenberg D, Shmaya E (2013) Garbling of signals and outcome equivalence. Games Econ Behav 81:179–191
Mailath GJ, Matthews SA, Sekiguchi T (2002) Private strategies in finitely repeated games with imperfect public monitoring. Contrib Theoret Econ. https://doi.org/10.2202/1534-5971.1046
Kandori M, Obara I (2006a) Efficiency in repeated games revisited: the role of private strategies. Econometrica 74(2):499–519
Gossner O, Mertens J-F (2001) The value of information in zero-sum games (preprint)
Pęski M (2008) Comparison of information structures in zero-sum games. Games Econ Behav 62:732–735
Sugaya T, Wolitzky A (2017) Bounding equilibrium payoffs in repeated games with private monitoring. Theoret Econ 12:691–729
Yamamoto Y (2016) Stochastic games with hidden states (Unpublished)
Additional information
I owe many thanks to Ichiro Obara for his continuous guidance, encouragement, and invaluable advice. I am also grateful to a co-editor and two anonymous referees for insightful comments. All remaining errors are my own.
Appendix
A Omitted proofs
A.1 Proof of Lemma 2
Proof
Fix \(\delta \in [0,1)\). Let \(W \subseteq \mathbb {R}^{ |S| \times N}\) be self-generating. Then, for each \(v = (v_1, \dots , v_{|S|}) \in W\) and \(s \in S\), we can find \(\alpha ^v_s \in \prod _{ i \in I} \Delta A_i\) and \(w^v_s : Y \rightarrow W\) that together decompose \(v_s\). If there is more than one such pair, choose one of them arbitrarily.
Pick \(v = (v_1, \dots , v_{|S|}) \in B(W)\). We want to show that for each state s, \(v_s \in E (s; \delta )\). Let \(\tilde{\mathcal {H}}_P^0 \equiv \{ \varnothing \}\) and \(\tilde{\mathcal {H}}_P^k \equiv \{ \varnothing \} \times (S \times Y)^k\) for \(k \ge 1\), i.e., the set of ex-ante public histories (that is, histories before \(s^k\) is realized). For each \(k \ge 0\), define \(\nu ^k: \tilde{\mathcal {H}}_P^k \rightarrow W\) recursively as follows: let \(\nu ^0 ( \varnothing ) := v\) and, for each \(k \ge 1\) and \(\tilde{h}^k = (\tilde{h}^{k-1}, s, y)\) where \(\tilde{h}^{k-1} \in \tilde{\mathcal {H}}_P^{k-1}\), \(s \in S\) and \(y \in Y\),
That is, \(\nu ^k (\tilde{h}^k) \in \mathbb {R}^{|S| \times N}\) is the continuation payoff vector corresponding to ex-ante history \(\tilde{h}^k\).
Define a public strategy as follows: for any \(k \ge 0\) and (ex-post) public history \(h^k \in \mathcal {H}_P^k\) (see Sect. 2)
where \(\tilde{h}^k \in \tilde{\mathcal {H}}_P^k\) is such that \((\varnothing , h^k) = (\tilde{h}^k, s(h^k) )\). Recall that \(s(h^k)\) is the most recent state in \(h^k\). Choose a state s. Then for each \(k \ge 0\),
where for each \(k \ge 0\), \(\mathbf {P}_k^\sigma (\cdot )\) is the probability measure on \(\mathcal {H}_P^k\) induced by \(\sigma \) and the initial state s. As W is bounded, as \(k \rightarrow \infty \),
The incentive compatibility of \(\sigma \) follows from the one-shot deviation principle. Therefore \(v_{s} \in E (s;\delta )\). \(\square \)
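The displayed formulas of this construction are not shown in the text above. As a hedged sketch only, suppose that a pair \((\alpha , w)\) decomposes \(v_s\) when \(v_s = (1-\delta ) g(s,\alpha ) + \delta \, \mathbb {E}^\Pi [ w(y)_t | s, \alpha ]\) and the usual incentive constraints hold. Here the subscript t, which selects the state-t component of a vector in W, and the linear extension of \(g = (g_i)_{i \in I}\) to mixed action profiles are notation assumed for this sketch and may differ from the original displays. Under these assumptions, the recursion, the public strategy, and the payoff identity for each \(k \ge 0\) could be written as:

$$\begin{aligned} \nu ^k (\tilde{h}^{k-1}, s, y)&:= w^{\nu ^{k-1} (\tilde{h}^{k-1})}_s (y), \\ \sigma (h^k)&:= \alpha ^{\nu ^k (\tilde{h}^k)}_{s(h^k)}, \\ v_s&= (1-\delta ) \sum _{\tau = 0}^{k} \delta ^\tau \sum _{h^\tau \in \mathcal {H}_P^\tau } \mathbf {P}_\tau ^\sigma (h^\tau ) \, g ( s(h^\tau ), \sigma (h^\tau ) ) + \delta ^{k+1} \sum _{h^{k+1} \in \mathcal {H}_P^{k+1}} \mathbf {P}_{k+1}^\sigma (h^{k+1}) \, \nu ^{k+1} (\tilde{h}^{k+1})_{s(h^{k+1})}. \end{aligned}$$

Read this way, boundedness of W sends the last term to zero as \(k \rightarrow \infty \), so \(v_s\) coincides with the discounted payoff of \(\sigma \) from initial state s.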
A.2 Proof of Proposition 2
Proof
In order to show \(E(\delta )\) is a fixed point of \(B(\cdot ; \delta )\), it suffices to prove \(E(\delta ) \subseteq B(E(\delta );\delta )\) since this will also imply \(B(E(\delta ); \delta ) \subseteq E(\delta )\) by Lemma 2. Let \(v \in E(\delta )\) and \(\sigma \in \Sigma _P\) be the corresponding public strategy profile. Then for each state s,
As \(\sigma \) is a PPE, for each y and t, \(U (\sigma |_{(s,y)};t ) \in E (t;\delta )\), and no player i has an incentive to deviate from \(\sigma _i (s)\). Thus, \(v_s \in B_s (E(\delta );\delta )\), and so \(v \in B (E(\delta );\delta )\).
Suppose \(W \subseteq \mathbb {R}^{ |S| \times N}\) is a bounded fixed point of \(B(\cdot ;\delta )\). Then, \(W = B(W;\delta ) \subseteq E(\delta )\) by Lemma 2. This implies that \(E(\delta )\) is the largest bounded fixed point of B. \(\square \)
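The decomposition invoked after "Then for each state s" in this proof is not displayed above. A plausible form, offered only as a sketch, uses the operator \(\mathbb {E}^\Pi \) from the Notes with \(\xi (t,y) = U(\sigma |_{(s,y)}; t)\) and assumes that g is extended linearly to mixed action profiles:

$$\begin{aligned} v_s = (1-\delta ) \, g (s, \sigma (s)) + \delta \, \mathbb {E}^\Pi \left[ U (\sigma |_{(s,y)}; t ) \, | \, s, \sigma (s) \right] . \end{aligned}$$

Under this reading, the current mixed action \(\sigma (s)\) and the continuation payoffs \(U (\sigma |_{(s,y)}; t )\) form the pair that decomposes \(v_s\) on \(E(\delta )\).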
Cite this article
Kim, D. Comparison of information structures in stochastic games with imperfect public monitoring. Int J Game Theory 48, 267–285 (2019). https://doi.org/10.1007/s00182-018-0643-9
DOI: https://doi.org/10.1007/s00182-018-0643-9
Keywords
- Stochastic game
- Blackwell sufficiency
- Information structure
- Public monitoring
- Perfect public equilibrium