Comparison of information structures in stochastic games with imperfect public monitoring

Kim, Daehyun

doi:10.1007/s00182-018-0643-9

Original Paper
Published: 16 October 2018

International Journal of Game Theory, volume 48, pages 267–285 (2019)

Daehyun Kim ORCID: orcid.org/0000-0002-8956-9644¹

Abstract

This paper studies the impact of an improvement of information structure upon the perfect public equilibrium payoff set in discounted stochastic games with imperfect public monitoring. We first suggest three partial orders on information structures in stochastic games. Although each of them reduces to the notion of garbling in repeated games (Kandori in Rev Econ Stud 59:581–593, 1992), we find that an improvement of information in terms of our two garbling notions does not imply an expansion of the equilibrium payoff set for some games. We also show that more informativeness in terms of our third notion of garbling is sufficient for the expansion, thereby extending the well-known monotonicity result in Kandori (1992) to stochastic games.

Notes

Kandori (1992) originally calls it quasi-garbling.
Kandori’s monotonicity result is originally about the pure strategy sequential equilibrium payoff set; our result is with respect to the perfect public equilibrium payoff set. He assumes full support of public monitoring signals, which implies that the set of pure strategy sequential equilibrium payoffs is identical to that of pure PPE payoffs; whereas we do not assume full support. We allow mixed public strategies.
See also Kandori and Obara (2006b) and Kamada and Kominers (2010). These papers point out that the equilibrium payoff set in repeated games could be larger with a less informative monitoring structure with different notions of informativeness.
See, for example, Hirshleifer (1971). Pęski (2008) [see also Gossner and Mertens (2001)] considers a smaller class of games, i.e., zero-sum games, and thereby obtains an analogous result to Blackwell (1951). Gossner (2000) provides a notion of informativeness for comparing the set of correlated equilibrium outcomes (and Bayes Nash equilibrium in games with incomplete information). See also Lehrer et al. (2013) for further results in this line.
For this reason, it is immaterial in our analysis which of the two is first observed by players.
See, for example, Hörner et al. (2011) or Fudenberg and Yamamoto (2011).
For a set B, $\Delta (B)$ denotes the set of probability distributions on B.
We may interpret player i’s (ex-ante) utility function as derived from an ex-post utility function $u_i : S \times A_i \times Y \rightarrow \mathbb {R}$:
$$\begin{aligned}g_i (s, a) \equiv \sum _{ ( t, y ) \in S \times Y} p(t,y|s,a) u_i (s,a_i, y)\end{aligned}$$
for each $(s,a) \in S \times A$. Note that observing a realization of the ex-post payoff gives players no further information about $a_{-i}$ beyond what they learn from s and y.
This assumption is trivially satisfied when players’ ex-post utility function does not depend on a public signal. In this case, the signal is employed purely for monitoring.
$\mathbb {I}_A$ is the indicator function, i.e. $\mathbb {I}_A= 1$ if A is true, and 0 otherwise.
Given $\Pi = (Y,f)$, for $\xi : S \times Y \rightarrow \mathbb {R}^N$, let $\mathbb {E}^\Pi [ \xi (t,y) | s, \alpha ] \equiv \sum _{ t \in S, y \in Y, a \in A} \alpha (a) f (y|t,s,a) q( t | s,a) \xi (t,y)$ for every mixed action profile $\alpha $ and state s.
Since there is always a chance of transition to $s_2$, if g is too large no $\delta $ satisfies the inequality.
Given a set A, co(A) denotes the convex hull of A.
Proposition 2 is proved without a public randomization device, but it is easy to extend to allow for public randomization.
Yamamoto (2016) studies this environment with perfect monitoring and provides a folk theorem. When monitoring is perfect, a recursive characterization is still available as we could regard the public belief for the state as a state.
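
The expectation operator $\mathbb {E}^\Pi [ \xi (t,y) | s, \alpha ]$ defined in the notes above can be evaluated mechanically on a finite game. The following is a minimal sketch, not taken from the paper: the state, signal and action labels and all probabilities are made-up toy values, $\alpha $ is stored directly as a distribution over action profiles, and $\xi $ is restricted to a single player's coordinate so that it is scalar-valued.

```python
def expected_xi(s, alpha, q, f, xi, states, signals):
    """Sum over (t, y, a) of alpha(a) * f(y | t, s, a) * q(t | s, a) * xi(t, y)."""
    total = 0.0
    for a, prob_a in alpha.items():
        for t in states:
            for y in signals:
                total += prob_a * f[(y, t, s, a)] * q[(t, s, a)] * xi[(t, y)]
    return total

# Hypothetical toy primitives (assumptions for illustration only).
states = ["s1", "s2"]
signals = ["g", "b"]
alpha = {"C": 0.5, "D": 0.5}  # distribution over action profiles

# q[(t, s, a)]: probability of next state t given current state s and profile a.
q = {("s1", "s1", "C"): 0.7, ("s2", "s1", "C"): 0.3,
     ("s1", "s1", "D"): 0.5, ("s2", "s1", "D"): 0.5}

# f[(y, t, s, a)]: probability of signal y given (t, s, a).
# In this toy example the signal distribution depends on a only.
f = {}
for t in states:
    f[("g", t, "s1", "C")] = 0.8
    f[("b", t, "s1", "C")] = 0.2
    f[("g", t, "s1", "D")] = 0.4
    f[("b", t, "s1", "D")] = 0.6

# xi[(t, y)]: one player's continuation payoff after (t, y).
xi = {("s1", "g"): 1.0, ("s1", "b"): 0.0, ("s2", "g"): 2.0, ("s2", "b"): -1.0}

value = expected_xi("s1", alpha, q, f, xi, states, signals)
```

Storing $\alpha $ as a distribution over profiles keeps the sketch short; with independent mixing, $\alpha (a)$ would be the product of the players' individual mixing probabilities.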

References

Blackwell D (1951) Comparison of experiments. Proc Second Berkeley Symp Math Stat Probab 1:93–102
Abreu D, Pearce D, Stacchetti E (1990) Toward a theory of discounted repeated games with imperfect monitoring. Econometrica 58(5):1041–1063
Fudenberg D, Levine D, Maskin E (1994) The folk theorem with imperfect public information. Econometrica 62(5):997–1039
Fudenberg D, Yamamoto Y (2011) The folk theorem for irreducible stochastic games with imperfect public monitoring. J Econ Theory 146:1664–1683
Gossner O (2000) Comparison of information structures. Games Econ Behav 30:44–63
Hirshleifer J (1971) The private and social value of information and the reward to inventive activity. Am Econ Rev 61:561–574
Hörner J, Sugaya T, Takahashi S, Vieille N (2011) Recursive methods in discounted stochastic games: an algorithm for $\delta \rightarrow 1$ and a folk theorem. Econometrica 79:1277–1318
Kamada Y, Kominers SD (2010) Information can wreck cooperation: a counterpoint to Kandori (1992). Econ Lett 107:112–114
Kandori M (1992) The use of information in repeated games with imperfect monitoring. Rev Econ Stud 59:581–593
Kandori M, Obara I (2006b) Less is more: an observability paradox in repeated games. Int J Game Theory 34:475–493
Kloosterman A (2015) Public information in Markov games. J Econ Theory 157:28–48
Lehrer E, Rosenberg D, Shmaya E (2013) Garbling of signals and outcome equivalence. Games Econ Behav 81:179–191
Mailath GJ, Matthews SA, Sekiguchi T (2002) Private strategies in finitely repeated games with imperfect public monitoring. Contrib Theoret Econ. https://doi.org/10.2202/1534-5971.1046
Kandori M, Obara I (2006a) Efficiency in repeated games revisited: the role of private strategies. Econometrica 74(2):499–519
Gossner O, Mertens J-F (2001) The value of information in zero-sum games (preprint)
Pęski M (2008) Comparison of information structures in zero-sum games. Games Econ Behav 62:732–735
Sugaya T, Wolitzky A (2017) Bounding equilibrium payoffs in repeated games with private monitoring. Theoret Econ 12:691–729
Yamamoto Y (2016) Stochastic games with hidden states (Unpublished)

Author information

Authors and Affiliations

Department of Economics, UCLA, Los Angeles, CA, USA
Daehyun Kim

Authors

Daehyun Kim

Corresponding author

Correspondence to Daehyun Kim.

Additional information

I owe many thanks to Ichiro Obara for his continuous guidance, encouragement, and invaluable advice. I am also grateful to a co-editor and two anonymous referees for insightful comments. All remaining errors are my own.

Appendix

A Omitted proofs

A.1 Proof of Lemma 2

Proof

Fix $\delta \in [0,1)$. Let $W \subseteq \mathbb {R}^{ |S| \times N}$ be self-generating. Then, for each $v = (v_1, \dots , v_{|S|}) \in W$ and $s \in S$, we can find $\alpha ^v_s \in \prod _{ i \in I} \Delta A_i$ and $w^v_s : Y \rightarrow W$ that decompose $v_s$. If there is more than one such pair, choose one arbitrarily.

Pick $v = (v_1, \dots , v_{|S|}) \in B(W)$. We want to show that for each state s, $v_s \in E (s; \delta )$. Let $\tilde{\mathcal {H}}_P^0 \equiv \{ \varnothing \}$ and $\tilde{\mathcal {H}}_P^k \equiv \{ \varnothing \} \times (S \times Y)^k$ for $k \ge 1$, i.e., the set of ex-ante public histories (that is, histories before $s^k$ is realized). For each $k \ge 0$, define $\nu ^k: \tilde{\mathcal {H}}_P^k \rightarrow W$ recursively as follows: let $\nu ^0 ( \varnothing ) := v$ and, for each $k \ge 1$ and $\tilde{h}^k = (\tilde{h}^{k-1}, s, y)$ where $\tilde{h}^{k-1} \in \tilde{\mathcal {H}}_P^{k-1}$, $s \in S$ and $y \in Y$,

$$\begin{aligned} \nu ^{k} (\tilde{h}^k) := w^{ \nu ^{k-1} (\tilde{h}^{k-1}) }_s (y) \in W. \end{aligned}$$

That is, $\nu ^k (\tilde{h}^k) \in \mathbb {R}^{|S| \times N}$ is the continuation payoff vector corresponding to ex-ante history $\tilde{h}^k$.
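
The recursion defining $\nu ^k$ is a fold over the ex-ante history: start from $v$ and repeatedly apply the chosen continuation map. The following is a minimal sketch under stated assumptions, not part of the original proof: $W$ is taken to be a finite set of labelled payoff vectors, and the dictionary `w_toy`, the labels `"reward"` and `"punish"`, and the state and signal names are hypothetical.

```python
from typing import Dict, Hashable, Sequence, Tuple

# Hypothetical encoding: w[v][s][y] is the label of the continuation payoff
# vector promised after signal y by the chosen decomposition of v_s.
Label = Hashable
Decomposition = Dict[Label, Dict[str, Dict[str, Label]]]

def nu(v0: Label, history: Sequence[Tuple[str, str]], w: Decomposition) -> Label:
    """Return nu^k(h) for the ex-ante history h = ((s, y), ..., (s, y)) of length k."""
    current = v0                 # nu^0(empty history) = v
    for s, y in history:         # nu^k = w^{nu^{k-1}}_s (y)
        current = w[current][s][y]
    return current

# Toy continuation maps (assumptions for illustration only).
w_toy: Decomposition = {
    "reward": {"s1": {"good": "reward", "bad": "punish"},
               "s2": {"good": "reward", "bad": "punish"}},
    "punish": {"s1": {"good": "reward", "bad": "punish"},
               "s2": {"good": "punish", "bad": "punish"}},
}

after_two_periods = nu("reward", [("s1", "bad"), ("s2", "good")], w_toy)
```

In this encoding the continuation vector depends on the history only through the current label, which is what lets the strategy built from it be represented as an automaton over $W$.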

Define a public strategy as follows: for any $k \ge 0$ and (ex-post) public history $h^k \in \mathcal {H}_P^k$ (see Sect. 2)

$$\begin{aligned} \sigma ( h^k ) := \alpha ^{ \nu ^k (\tilde{h}^{k})}_{s(h^k)} \end{aligned}$$

where $\tilde{h}^k \in \tilde{\mathcal {H}}_P^k$ is such that $(\varnothing , h^k) = (\tilde{h}^k, s(h^k) )$. Recall that $s(h^k)$ is the most recent state in $h^k$. Choose a state s. Then for each $k \ge 0$,

$$\begin{aligned} \begin{aligned} v_s&= (1- \delta ) g ( s, \sigma ( s) ) + \delta \sum _{ s^1 , y^0 } p(s^1, y^0 | s, \sigma (s)) w^{v}_{s} (s^1, y^0)\\&= (1- \delta ) g(s, \sigma (s )) + \delta \sum _{ s^1 , y^0 } p(s^1, y^0 | s, \sigma (s))\\&\quad \times \left( (1- \delta ) g (s^1, \sigma (s, y^0, s^1)) + \delta \sum _{ s^2 , y^1} p(s^2, y^1 | s^1, \sigma (s, y^0, s^1)) w_{s^1}^{ \nu ^1 (\varnothing , s,y^0) }(s^2, y^1) \right) \\&= (1- \delta ) \sum _{ \tau =0}^{k} \delta ^\tau \sum _{h^\tau \in \mathcal {H}_P^\tau } g (s (h^\tau ), \sigma (h^\tau )) \mathbf {P}_\tau ^\sigma (h^\tau )\\&\quad + \delta ^{k+1} \sum _{h^{k+1} \in \mathcal {H}_P^{k+1}} \mathbf {P}_{k+1}^\sigma (h^{k+1}) \sum _{ s^{k+1}, y^{k}} p ( s^{k+1}, y^k | s (h^k), \sigma (h^k))w^{\nu ^k (\tilde{h}^{k}) }_{s (h^k) } ( s^{k+1}, y^{k}), \end{aligned} \end{aligned}$$

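where for each $k \ge 0$, $\mathbf {P}_k^\sigma (\cdot )$ is the probability measure on $\mathcal {H}_P^k$ induced by $\sigma $ and the initial state s. As W is bounded, the last term is bounded in norm by $\delta ^{k+1} \sup _{w \in W} \Vert w \Vert $, which vanishes as $k \rightarrow \infty $. Hence,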

$$\begin{aligned} \begin{aligned} v_s&= (1- \delta ) \sum _{ k=0}^{\infty } \delta ^k \sum _{h^k \in \mathcal {H}_P^k} g(s (h^k), \sigma (h^k )) \mathbf {P}_k^\sigma (h^k) \\&=(1- \delta ) \mathbb {E}_{ \mathbf {P}^\sigma } \left[ \sum _{k=0}^\infty \delta ^k g ( s (h^k), a (h^k) ) \, \Big | \, s_0 = s \right] . \end{aligned} \end{aligned}$$
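
The unrolling argument can be checked numerically in a deliberately simplified setting. The sketch below assumes a stationary strategy, so that the state follows a Markov chain with a fixed transition matrix `Q` and stage payoffs `g`; these numbers are illustrative and do not come from the paper. It verifies that $v = (1-\delta ) \sum _{\tau \le k} \delta ^\tau Q^\tau g + \delta ^{k+1} Q^{k+1} v$ for several $k$, and that the remainder term shrinks like $\delta ^{k+1}$.

```python
def mat_vec(Q, x):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(Q[i][j] * x[j] for j in range(len(x))) for i in range(len(Q))]

def stationary_value(Q, g, delta, iters=2000):
    """Iterate v <- (1 - delta) g + delta Q v, a contraction with modulus delta."""
    v = [0.0] * len(g)
    for _ in range(iters):
        Qv = mat_vec(Q, v)
        v = [(1 - delta) * g[i] + delta * Qv[i] for i in range(len(g))]
    return v

def remainder(Q, v, delta, k):
    """The tail term delta^{k+1} Q^{k+1} v left after unrolling k + 1 periods."""
    tail = list(v)
    for _ in range(k + 1):
        tail = mat_vec(Q, tail)
    return [delta ** (k + 1) * t for t in tail]

def unrolled(Q, g, v, delta, k):
    """(1 - delta) * sum_{tau <= k} delta^tau Q^tau g, plus the remainder term."""
    total = [0.0] * len(g)
    x = list(g)  # holds Q^tau g
    for tau in range(k + 1):
        total = [total[i] + (1 - delta) * delta ** tau * x[i] for i in range(len(g))]
        x = mat_vec(Q, x)
    tail = remainder(Q, v, delta, k)
    return [total[i] + tail[i] for i in range(len(g))]

# Toy two-state example (assumptions for illustration only).
Q = [[0.9, 0.1], [0.4, 0.6]]
g = [2.0, -1.0]
delta = 0.8
v = stationary_value(Q, g, delta)
```

Because the rows of `Q` sum to one, the remainder is at most $\delta ^{k+1} \max _s |v_s|$ in each coordinate, which mirrors the role that boundedness of W plays in the proof.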

The incentive compatibility of $\sigma $ follows from the one-shot deviation principle. Therefore $v_{s} \in E (s;\delta )$. $\square $

A.2 Proof of Proposition 2

Proof

In order to show $E(\delta )$ is a fixed point of $B(\cdot ; \delta )$, it suffices to prove $E(\delta ) \subseteq B(E(\delta );\delta )$ since this will also imply $B(E(\delta ); \delta ) \subseteq E(\delta )$ by Lemma 2. Let $v \in E(\delta )$ and $\sigma \in \Sigma _P$ be the corresponding public strategy profile. Then for each state s,

$$\begin{aligned} v_s = U (\sigma ; s) = (1- \delta ) g (s, \sigma (s)) + \delta \sum _{ t \in S, y \in Y} U ( \sigma |_{ (s, y)};t) p(t,y |s, \sigma (s)). \end{aligned}$$

As $\sigma $ is a PPE, for each y and t, $U (\sigma |_{(s,y)};t ) \in E (t;\delta )$, and no player i has an incentive to deviate from $\sigma _i (s)$. Thus, $v_s \in B_s (E(\delta );\delta )$, and so $v \in B (E(\delta );\delta )$.

Suppose $W \subseteq \mathbb {R}^{ |S| \times N}$ is a bounded fixed point of $B(\cdot ;\delta )$. Then, $W = B(W;\delta ) \subseteq E(\delta )$ by Lemma 2. This implies that $E(\delta )$ is the largest bounded fixed point of B. $\square $

About this article

Cite this article

Kim, D. Comparison of information structures in stochastic games with imperfect public monitoring. Int J Game Theory 48, 267–285 (2019). https://doi.org/10.1007/s00182-018-0643-9

Accepted: 02 October 2018
Issue Date: 06 March 2019
