
Comparison of information structures in stochastic games with imperfect public monitoring

  • Original Paper
  • International Journal of Game Theory

Abstract

This paper studies the impact of an improvement in the information structure on the perfect public equilibrium payoff set in discounted stochastic games with imperfect public monitoring. We first suggest three partial orders on information structures in stochastic games. Although each of them reduces to the notion of garbling in repeated games (Kandori in Rev Econ Stud 59:581–593, 1992), we find that, for some games, an improvement of information in terms of the first two of these notions does not imply an expansion of the equilibrium payoff set. We also show that more informativeness in terms of our third notion of garbling is sufficient for the expansion, thereby extending the well-known monotonicity result in Kandori (1992) to stochastic games.


Notes

  1. Kandori (1992) originally calls it quasi-garbling.

  2. Kandori’s monotonicity result is originally about the pure strategy sequential equilibrium payoff set; our result is with respect to the perfect public equilibrium payoff set. He assumes full support of public monitoring signals, which implies that the set of pure strategy sequential equilibrium payoffs is identical to that of pure PPE payoffs; whereas we do not assume full support. We allow mixed public strategies.

  3. See also Kandori and Obara (2006b) and Kamada and Kominers (2010). These papers point out that the equilibrium payoff set in repeated games could be larger with a less informative monitoring structure with different notions of informativeness.

  4. See, for example, Hirshleifer (1971). Pęski (2008) [see also Gossner and Mertens (2001)] considers a smaller class of games, i.e., zero-sum games, and thereby obtains an analogous result to Blackwell (1951). Gossner (2000) provides a notion of informativeness for comparing the set of correlated equilibrium outcomes (and Bayes Nash equilibrium in games with incomplete information). See also Lehrer et al. (2013) for further results in this line.

  5. For this reason, it is immaterial in our analysis which of the two is first observed by players.

  6. See, for example, Hörner et al. (2011) or Fudenberg and Yamamoto (2011).

  7. For a set B, \(\Delta (B)\) denotes the set of probability distributions on B.

  8. We may interpret player i’s (ex-ante) utility function as it is derived from an ex-post utility function \(u_i : S \times A_i \times Y \rightarrow \mathbb {R}\):

    $$\begin{aligned}g_i (s, a) \equiv \sum _{ ( t, y ) \in S \times Y} p(t,y|s,a) u_i (s,a_i, y)\end{aligned}$$

    for each \((s,a) \in S \times A\). Note that observing a realization of the ex-post payoff gives player i no further information about \(a_{-i}\) once s and y are observed.

  9. This assumption is trivially satisfied when players’ ex-post utility function does not depend on a public signal. In this case, the signal is employed purely for monitoring.

  10. \(\mathbb {I}_A\) is the indicator function, i.e. \(\mathbb {I}_A= 1\) if A is true, and 0 otherwise.

  11. Given \(\Pi = (Y,f)\), for \(\xi : S \times Y \rightarrow \mathbb {R}^N\), let \(\mathbb {E}^\Pi [ \xi (t,y) | s, \alpha ] \equiv \sum _{ t \in S, y \in Y, a \in A} \alpha (a) f (y|t,s,a) q( t | s,a) \xi (t,y)\) for every mixed action profile \(\alpha \) and state s.

  12. Since there is always a chance of a transition to \(s_2\), if g is too large there is no \(\delta \) that satisfies the inequality.

  13. Given a set A, co(A) denotes the convex hull of A.

  14. Proposition 2 is proved without a public randomization device, but it is easy to extend to allow for public randomization.

  15. Yamamoto (2016) studies this environment with perfect monitoring and provides a folk theorem. When monitoring is perfect, a recursive characterization is still available as we could regard the public belief for the state as a state.
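
The expectation defined in note 11 can be computed directly for finite S, Y and A. The following is a minimal sketch, not part of the original paper: it assumes that \(\alpha (a)\) is the product of the players' individual mixing probabilities, and the dictionaries `q`, `f`, `xi` and all toy numbers are hypothetical. It evaluates one real coordinate of \(\xi \).

```python
from itertools import product


def expected_xi(s, alpha, q, f, xi, states, signals):
    """E^Pi[xi(t, y) | s, alpha] for a product mixed action profile alpha.

    alpha[i][a_i]   : probability that player i plays a_i (assumed independent)
    q[(t, s, a)]    : probability of next state t given state s and profile a
    f[(y, t, s, a)] : probability of public signal y given (t, s, a)
    xi[(t, y)]      : one real coordinate of xi
    """
    total = 0.0
    for a in product(*(list(d) for d in alpha)):
        prob_a = 1.0
        for i, a_i in enumerate(a):
            prob_a *= alpha[i][a_i]
        for t in states:
            for y in signals:
                total += prob_a * f[(y, t, s, a)] * q[(t, s, a)] * xi[(t, y)]
    return total


# Hypothetical toy data: two states, two signals, two players with actions C and D.
states = [0, 1]
signals = ["g", "b"]
actions = ["C", "D"]
profiles = list(product(actions, actions))
q = {(t, s, a): 0.5 for t in states for s in states for a in profiles}
f = {(y, t, s, a): (0.7 if y == "g" else 0.3)
     for y in signals for t in states for s in states for a in profiles}
alpha = [{"C": 0.25, "D": 0.75}, {"C": 1.0, "D": 0.0}]

# xi pays 1 after signal "g" and 0 after "b", so the expectation is Pr(y = g).
xi = {(t, y): (1.0 if y == "g" else 0.0) for t in states for y in signals}
value = expected_xi(0, alpha, q, f, xi, states, signals)

# A constant xi must integrate to that constant.
ones = {(t, y): 1.0 for t in states for y in signals}
total_mass = expected_xi(0, alpha, q, f, ones, states, signals)
```

In this toy example the signal distribution does not depend on actions or states, so `value` is simply the probability of the signal "g".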

References

  • Abreu D, Pearce D, Stacchetti E (1990) Toward a theory of discounted repeated games with imperfect monitoring. Econometrica 58(5):1041–1063

  • Blackwell D (1951) Comparison of experiments. Proc Second Berkeley Symp Math Stat Probab 1:93–102

  • Fudenberg D, Levine D, Maskin E (1994) The folk theorem with imperfect public information. Econometrica 62(5):997–1039

  • Fudenberg D, Yamamoto Y (2011) The folk theorem for irreducible stochastic games with imperfect public monitoring. J Econ Theory 146:1664–1683

  • Gossner O (2000) Comparison of information structures. Games Econ Behav 30:44–63

  • Gossner O, Mertens J-F (2001) The value of information in zero-sum games (preprint)

  • Hirshleifer J (1971) The private and social value of information and the reward to inventive activity. Am Econ Rev 61:561–574

  • Hörner J, Sugaya T, Takahashi S, Vieille N (2011) Recursive methods in discounted stochastic games: an algorithm for \(\delta \rightarrow 1\) and a folk theorem. Econometrica 79:1277–1318

  • Kamada Y, Kominers SD (2010) Information can wreck cooperation: a counterpoint to Kandori (1992). Econ Lett 107:112–114

  • Kandori M (1992) The use of information in repeated games with imperfect monitoring. Rev Econ Stud 59:581–593

  • Kandori M, Obara I (2006a) Efficiency in repeated games revisited: the role of private strategies. Econometrica 74(2):499–519

  • Kandori M, Obara I (2006b) Less is more: an observability paradox in repeated games. Int J Game Theory 34:475–493

  • Kloosterman A (2015) Public information in Markov games. J Econ Theory 157:28–48

  • Lehrer E, Rosenberg D, Shmaya E (2013) Garbling of signals and outcome equivalence. Games Econ Behav 81:179–191

  • Mailath GJ, Matthews SA, Sekiguchi T (2002) Private strategies in finitely repeated games with imperfect public monitoring. Contrib Theoret Econ. https://doi.org/10.2202/1534-5971.1046

  • Pęski M (2008) Comparison of information structures in zero-sum games. Games Econ Behav 62:732–735

  • Sugaya T, Wolitzky A (2017) Bounding equilibrium payoffs in repeated games with private monitoring. Theoret Econ 12:691–729

  • Yamamoto Y (2016) Stochastic games with hidden states (Unpublished)


Author information

Correspondence to Daehyun Kim.

Additional information

I owe many thanks to Ichiro Obara for his continuous guidance, encouragement, and invaluable advice. I am also grateful to a co-editor and two anonymous referees for insightful comments. All remaining errors are my own.

Appendix

A Omitted proofs

A.1 Proof of Lemma 2

Proof

Fix \(\delta \in [0,1)\). Let \(W \subseteq \mathbb {R}^{ |S| \times N}\) be self-generating. Then, for each \(v = (v_1, \dots , v_{|S|}) \in W\) and \(s \in S\), we can find \(\alpha ^v_s \in \prod _{ i \in I} \Delta A_i\) and \(w^v_s : Y \rightarrow W\) that decompose \(v_s\). If there is more than one such pair, choose one arbitrarily.

Pick \(v = (v_1, \dots , v_{|S|}) \in B(W)\). We want to show that for each state s, \(v_s \in E (s; \delta )\). Let \(\tilde{\mathcal {H}}_P^0 \equiv \{ \varnothing \}\) and \(\tilde{\mathcal {H}}_P^k \equiv \{ \varnothing \} \times (S \times Y)^k\) for \(k \ge 1\) be the sets of ex-ante public histories (that is, histories before \(s^k\) is realized). For each \(k \ge 0\), define \(\nu ^k: \tilde{\mathcal {H}}_P^k \rightarrow W\) recursively as follows: let \(\nu ^0 ( \varnothing ) := v\) and, for each \(k \ge 1\) and \(\tilde{h}^k = (\tilde{h}^{k-1}, s, y)\) where \(\tilde{h}^{k-1} \in \tilde{\mathcal {H}}_P^{k-1}\), \(s \in S\) and \(y \in Y\),

$$\begin{aligned} \nu ^{k} (\tilde{h}^k) := w^{ \nu ^{k-1} (\tilde{h}^{k-1}) }_s (y) \in W. \end{aligned}$$

That is, \(\nu ^k (\tilde{h}^k) \in \mathbb {R}^{|S| \times N}\) is the continuation payoff vector corresponding to ex-ante history \(\tilde{h}^k\).
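
The recursion for \(\nu ^k\) is a fold over the ex-ante history: start from v and repeatedly look up the continuation vector selected by the realized state and signal. The sketch below is only an illustration under assumptions that are not in the paper: payoff vectors are represented by hashable labels, and the table `w`, with `w[(v, s)][y]` standing for \(w^v_s (y)\), is a hypothetical toy example.

```python
def continuation(v0, ex_ante_history, w):
    """Return nu^k(h~^k) by folding the decomposition maps along the history.

    v0              : label of the initial payoff vector, nu^0(empty history)
    ex_ante_history : list of (s, y) pairs, i.e. h~^k without the leading empty history
    w[(v, s)][y]    : label of the continuation vector w^v_s(y) (hypothetical table)
    """
    v = v0
    for s, y in ex_ante_history:
        v = w[(v, s)][y]
    return v


# Hypothetical toy table: in every state, a good signal "g" leads to the
# vector labelled "reward" and a bad signal "b" leads to "punish".
labels = ["v", "reward", "punish"]
w = {(v, s): {"g": "reward", "b": "punish"} for v in labels for s in (0, 1)}

nu0 = continuation("v", [], w)
nu2 = continuation("v", [(0, "g"), (1, "b")], w)
```

With this table, `nu0` is the initial label and `nu2` depends only on the last signal, because the toy continuation rule ignores the current vector and state.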

Define a public strategy as follows: for any \(k \ge 0\) and (ex-post) public history \(h^k \in \mathcal {H}_P^k\) (see Sect. 2)

$$\begin{aligned} \sigma ( h^k ) := \alpha ^{ \nu ^k (\tilde{h}^{k})}_{s(h^k)} \end{aligned}$$

where \(\tilde{h}^k \in \tilde{\mathcal {H}}_P^k\) is such that \((\varnothing , h^k) = (\tilde{h}^k, s(h^k) )\). Recall that \(s(h^k)\) is the most recent state in \(h^k\). Choose a state s. Then for each \(k \ge 0\),

$$\begin{aligned} \begin{aligned} v_s&= (1- \delta ) g ( s, \sigma ( s) ) + \delta \sum _{ s^1 , y^0 } p(s^1, y^0 | s, \sigma (s)) w^{v}_{s} (s^1, y^0)\\&= (1- \delta ) g(s, \sigma (s )) + \delta \sum _{ s^1 , y^0 } p(s^1, y^0 | s, \sigma (s))\\&\quad \times \left( (1- \delta ) g (s^1, \sigma (s, y^0, s^1)) + \delta \sum _{ s^2 , y^1} p(s^2, y^1 | s^1, \sigma (s, y^0, s^1)) w_{s^1}^{ \nu ^1 (\varnothing , s,y^0) }(s^2, y^1) \right) \\&= (1- \delta ) \sum _{ \tau =0}^{k} \delta ^\tau \sum _{h^\tau \in \mathcal {H}_P^\tau } g (s (h^\tau ), \sigma (h^\tau )) \mathbf {P}_\tau ^\sigma (h^\tau )\\&\quad + \delta ^{k+1} \sum _{h^{k} \in \mathcal {H}_P^{k}} \mathbf {P}_{k}^\sigma (h^{k}) \sum _{ s^{k+1}, y^{k}} p ( s^{k+1}, y^k | s (h^k), \sigma (h^k))w^{\nu ^k (\tilde{h}^{k}) }_{s (h^k) } ( s^{k+1}, y^{k}), \end{aligned} \end{aligned}$$

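where for each \(k \ge 0\), \(\mathbf {P}_k^\sigma (\cdot )\) is the probability measure on \(\mathcal {H}_P^k\) induced by \(\sigma \) and the initial state s, and \(w^{v}_{s} (t, y)\) abbreviates the coordinate of \(w^{v}_{s} (y) \in W\) that corresponds to the next state t. As W is bounded, the last term vanishes as \(k \rightarrow \infty \), so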

$$\begin{aligned} \begin{aligned} v_s&= (1- \delta ) \sum _{ k=0}^{\infty } \delta ^k \sum _{h^k \in \mathcal {H}_P^k} g(s (h^k), \sigma (h^k )) \mathbf {P}_k^\sigma (h^k) \\&=(1- \delta ) \mathbb {E}_{ \mathbf {P}^\sigma } \left[ \sum _{k=0}^\infty \delta ^k g ( s (h^k), a (h^k) ) \, \Big | \, s_0 = s \right] . \end{aligned} \end{aligned}$$

The incentive compatibility of \(\sigma \) follows from the one-shot deviation principle. Therefore \(v_{s} \in E (s;\delta )\). \(\square \)
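
The unrolling step in this proof can be checked numerically in a simple special case. The sketch below is an assumption-laden illustration, not the general construction: it takes a stationary strategy, so that continuation payoffs depend only on the next state, a single payoff coordinate, and hypothetical numbers for the stage payoffs `g`, the state transition matrix `Q` (signals integrated out) and the discount factor. It verifies that the truncated discounted sum plus the \(\delta ^{k+1}\) remainder equals \(v_s\) for every k, and that the remainder vanishes as k grows.

```python
def solve_values(g, Q, delta, iters=2000):
    """Iterate v <- (1 - delta) * g + delta * Q v, a contraction for delta < 1."""
    n = len(g)
    v = [0.0] * n
    for _ in range(iters):
        v = [(1 - delta) * g[s] + delta * sum(Q[s][t] * v[t] for t in range(n))
             for s in range(n)]
    return v


def unroll(g, Q, delta, v, s, k):
    """Return (discounted payoff through period k, delta^(k+1) remainder) from state s."""
    n = len(g)
    mu = [1.0 if t == s else 0.0 for t in range(n)]  # state distribution in period 0
    partial = 0.0
    for tau in range(k + 1):
        partial += (1 - delta) * delta ** tau * sum(mu[t] * g[t] for t in range(n))
        mu = [sum(mu[t] * Q[t][u] for t in range(n)) for u in range(n)]
    # After the loop, mu is the state distribution in period k + 1.
    remainder = delta ** (k + 1) * sum(mu[t] * v[t] for t in range(n))
    return partial, remainder


# Hypothetical two-state example.
g = [1.0, 0.0]
Q = [[0.9, 0.1], [0.5, 0.5]]
delta = 0.8
v = solve_values(g, Q, delta)

gaps = []
for k in (0, 1, 5, 50):
    partial, remainder = unroll(g, Q, delta, v, 0, k)
    gaps.append(abs(partial + remainder - v[0]))

_, late_remainder = unroll(g, Q, delta, v, 0, 200)
```

Every entry of `gaps` should be zero up to floating-point error, and `late_remainder` is negligible because the continuation values are bounded while \(\delta ^{k+1} \rightarrow 0\).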

A.2 Proof of Proposition 2

Proof

In order to show \(E(\delta )\) is a fixed point of \(B(\cdot ; \delta )\), it suffices to prove \(E(\delta ) \subseteq B(E(\delta );\delta )\) since this will also imply \(B(E(\delta ); \delta ) \subseteq E(\delta )\) by Lemma 2. Let \(v \in E(\delta )\) and \(\sigma \in \Sigma _P\) be the corresponding public strategy profile. Then for each state s,

$$\begin{aligned} v_s = U (\sigma ; s) = (1- \delta ) g (s, \sigma (s)) + \delta \sum _{ t \in S, y \in Y} U ( \sigma |_{ (s, y)};t) p(t,y |s, \sigma (s)). \end{aligned}$$

As \(\sigma \) is a PPE, for each y and t, \(U (\sigma |_{(s,y)};t ) \in E (t;\delta )\), and no player i has an incentive to deviate from \(\sigma _i (s)\). Thus, \(v_s \in B_s (E(\delta );\delta )\), and so \(v \in B (E(\delta );\delta )\).

Suppose \(W \subseteq \mathbb {R}^{ |S| \times N}\) is a bounded fixed point of \(B(\cdot ;\delta )\). Then, \(W = B(W;\delta ) \subseteq E(\delta )\) by Lemma 2. This implies that \(E(\delta )\) is the largest bounded fixed point of \(B(\cdot ;\delta )\). \(\square \)

Cite this article

Kim, D. Comparison of information structures in stochastic games with imperfect public monitoring. Int J Game Theory 48, 267–285 (2019). https://doi.org/10.1007/s00182-018-0643-9


  • DOI: https://doi.org/10.1007/s00182-018-0643-9
