Abstract
The computation of subgame perfect equilibria in stationary strategies is an important but challenging problem in applications of stochastic games. In 2004, Herings and Peeters developed a homotopy method, the stochastic linear tracing procedure, to solve this problem. However, the starting point of their method must be computed explicitly. To remedy this issue, we formulate an arbitrary-starting linear tracing procedure in this paper. By introducing a homotopy variable ranging from two to zero, we develop an artificial penalty game whose solutions form a differentiable path after a well-chosen change of variables. The starting point of the path can be chosen arbitrarily, so no additional algorithm is needed to obtain it. By following the path, one readily attains the "starting point" of the stochastic tracing procedure devised by Herings and Peeters. Then, as the homotopy variable decreases from one to zero, the path essentially coincides with the stochastic tracing procedure. We prove that our method globally converges to a subgame perfect equilibrium in stationary strategies for the stochastic game of interest. Numerical results further illustrate the effectiveness and efficiency of the method.
Notes
Interested readers may refer to [16] for a detailed treatment of the value iteration and policy iteration algorithms for solving Markov decision problems.
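For concreteness, the following is a minimal value-iteration sketch for a finite Markov decision problem. The two-state, two-action instance is made up for illustration and is not from the paper.

```python
def value_iteration(P, r, delta, tol=1e-10):
    """P[a][s][sp]: transition probabilities, r[a][s]: rewards, delta in (0,1)."""
    n = len(r[0])
    v = [0.0] * n
    while True:
        # Bellman update: best one-step reward plus discounted continuation value
        v_new = [
            max(r[a][s] + delta * sum(P[a][s][sp] * v[sp] for sp in range(n))
                for a in range(len(r)))
            for s in range(n)
        ]
        if max(abs(v_new[s] - v[s]) for s in range(n)) < tol:
            return v_new
        v = v_new

# Hypothetical 2-state, 2-action problem (illustrative data only)
P = [[[0.9, 0.1], [0.2, 0.8]],   # transitions under action 0
     [[0.5, 0.5], [0.4, 0.6]]]   # transitions under action 1
r = [[1.0, 0.0], [0.5, 0.8]]     # rewards r[a][s]
v = value_iteration(P, r, delta=0.9)
```

Since the Bellman operator is a contraction with modulus \(\delta <1\), the iteration converges to the unique fixed point.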
\(\beta _0\) is chosen as \(10^{-12}\) or even smaller in our numerical experiments. We introduce \(\beta _0\) here just to guarantee the differentiability of \(\nu (t)\) and our homotopy, which will be discussed later.
The homotopy variable t in SLTP changes from zero to one, while the direction of t in our method is opposite.
Interested readers may refer to [25] for a detailed treatment of the transversality theorem.
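Generically, a procedure of this kind traces the zero set of a homotopy \(H(x,t)\) as \(t\) decreases from two to zero. The sketch below follows a convex homotopy between the trivial system \(x-x_0=0\) (solved by the arbitrarily chosen start \(x_0\)) and a toy target equation \(f(x)=0\); the map \(f\), the stepping scheme, and the Newton corrector are illustrative assumptions, not the paper's actual penalty-game homotopy.

```python
def f(x):                  # toy target equation f(x) = 0 (assumed for illustration)
    return x**3 + 2.0 * x - 4.0

def df(x):
    return 3.0 * x**2 + 2.0

def trace(x0, steps=200, newton_iters=20):
    """Follow H(x,t) = (1 - t/2) f(x) + (t/2)(x - x0) = 0 as t goes 2 -> 0."""
    x = x0
    for k in range(steps + 1):
        t = 2.0 * (1.0 - k / steps)          # homotopy variable decreases from 2 to 0
        s = t / 2.0
        for _ in range(newton_iters):        # Newton corrector in x at fixed t
            h = (1.0 - s) * f(x) + s * (x - x0)
            dh = (1.0 - s) * df(x) + s       # > 0 for all x >= 0, so Newton is well defined
            x -= h / dh
    return x

root = trace(x0=5.0)       # the start is arbitrary, echoing the arbitrary-starting idea
```

At \(t=2\) the system reduces to \(x=x_0\), and at \(t=0\) it reduces to \(f(x)=0\), so the traced path connects the arbitrary start to a solution of the target system.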
References
Chen, Y., Dang, C.: A differentiable homotopy method to compute perfect equilibria. Math. Program. (2019). https://doi.org/10.1007/s10107-019-01422-y
Chen, Y., Dang, C.: An extension of quantal response equilibrium and determination of perfect equilibrium. Games Econ. Behav. (2018). https://doi.org/10.1016/j.geb.2017.12.023
Herings, P.J.J., Peeters, R.: A globally convergent algorithm to compute all Nash equilibria for \(n\)-person games. Ann. Oper. Res. 137(1), 349–368 (2005)
Shapley, L.S.: Stochastic games. Proc. Natl. Acad. Sci. USA 39(10), 1095–1100 (1953)
Adlakha, S., Johari, R., Weintraub, G.Y.: Equilibria of dynamic games with many players: existence, approximation, and market structure. J. Econ. Theory 156, 269–316 (2015)
Maskin, E., Tirole, J.: Markov perfect equilibrium. I. Observable actions. J. Econ. Theory 100(2), 191–219 (2001)
Fink, A.M.: Equilibrium in a stochastic \(n\)-person game. J. Sci. Hiroshima Univ. Ser. A-I Math. 28(1), 89–93 (1964)
Sobel, M.J.: Noncooperative stochastic games. Ann. Math. Stat. 42(6), 1930–1935 (1971)
Takahashi, M.: Equilibrium points of stochastic non-cooperative \(n\)-person games. J. Sci. Hiroshima Univ. Ser. A-I Math. 28(1), 95–99 (1964)
Eaves, B.C.: Homotopies for computation of fixed points. Math. Program. 3(1), 1–22 (1972)
Scarf, H.: The approximation of fixed points of a continuous mapping. SIAM J. Appl. Math. 15(5), 1328–1343 (1967)
Dreves, A.: How to select a solution in generalized Nash equilibrium problems. J. Optim. Theory Appl. 178, 973–997 (2018). https://doi.org/10.1007/s10957-018-1327-0
Dang, C.: Simplicial methods for approximating fixed point with applications in combinatorial optimizations. In: Handbook of Combinatorial Optimization, pp. 3015–3056 (2013)
Zhan, Y., Dang, C.: A smooth path-following algorithm for market equilibrium under a class of piecewise-smooth concave utilities. Comput. Optim. Appl. 71(2), 381–402 (2018)
Herings, P.J.J., Peeters, R.J.: Stationary equilibria in stochastic games: structure, selection, and computation. J. Econ. Theory 118, 32–60 (2004)
Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York (1994)
Fudenberg, D., Tirole, J.: Game Theory, vol. 393. MIT Press, Cambridge (1991)
Parthasarathy, T., Raghavan, T.E.S.: An orderfield property for stochastic games when one player controls transition probabilities. J. Optim. Theory Appl. 33(3), 375–392 (1981)
Corbae, D., Stinchcombe, M.B., Zeman, J.: An Introduction to Mathematical Analysis for Economic Theory and Econometrics. Princeton University Press, Princeton (2009)
Browder, F.E.: On continuity of fixed points under deformations of continuous mappings. Summa Brasiliensis Mathematicae 4, 183–191 (1960)
Herings, J.J.: Two simple proofs of the feasibility of the linear tracing procedure. Econ. Theory 15(2), 485–490 (2000)
Herings, P.J.J., Peeters, R.J.: A differentiable homotopy to compute Nash equilibria of \(n\)-person games. Econ. Theory 18(1), 159–185 (2001)
Lemke, C.: Pathways to Solutions, Fixed-points, and Equilibria. Prentice-Hall, Upper Saddle River (1983)
Schanuel, S.H., Simon, L.K., Zame, W.R.: The algebraic geometry of games and the tracing procedure. In: Game Equilibrium Models II, pp. 9–43. Springer, New York (1991)
Eaves, B.C., Schmedders, K.: General equilibrium models and homotopy methods. J. Econ. Dyn. Control 23(9–10), 1249–1279 (1999)
Allgower, E.L., Georg, K.: Introduction to Numerical Continuation Methods, vol. 45. SIAM, Philadelphia (2003)
Acknowledgements
This work was partially supported by the National Natural Science Foundation of China (Grant 61976184).
Communicated by Francesco Zirilli.
Appendices
A: Proof of Theorem 3.2
In this appendix, we intend to prove that zero is a regular value of \(q(y,\mu ,t;\alpha )\) when \(0< t\le 2\). This result is applied in the proof of Theorem 3.2.
First, we consider the case that \(0<t<2\). For simplicity, we rewrite q as
where
and
Let \(v:=(y,\mu ,t;\alpha )\in \mathbb R^m\times \varLambda \times (0,2) \times {\mathbb {R}}^{m}\). Then, the Jacobian matrix of q is an \((m+nd)\times (2m+nd+1)\) matrix, which can be written as
where \(\dfrac{\partial q^1}{\partial y}\in {\mathbb {R}}^{m}\times {\mathbb {R}}^m\), \(\dfrac{\partial q^1}{\partial \mu }\in {\mathbb {R}}^m \times {\mathbb {R}}^{nd}\), and \(\dfrac{\partial q^1}{\partial t} \in {\mathbb {R}}^m \times {\mathbb {R}}^1\). Moreover, \(\dfrac{\partial q^1}{\partial \alpha }=-t(2-t)\mathbf{I}_m\), where \(\mathbf{I}_m\) is the \(m\times m\) identity matrix. Thus, when \(0<t<2\), \(\dfrac{\partial q^1}{\partial \alpha }\) is a nonsingular diagonal matrix. Let \(r_{\omega j}^i:=\max \{0,y_{\omega j}^i\}\) and let \(r_{\omega }^i\) denote the vector \((r_{\omega 1}^i, r_{\omega 2}^i,\ldots ,r_{\omega m_{\omega }^i}^i)\). Then, \(\dfrac{\partial q^2}{\partial y}=2\,\mathrm {diag}(r_{\omega }^i)\) is an \((nd\times m)\) matrix. Clearly, \(\dfrac{\partial q^2}{\partial y}\) has full row rank, because for every pair \((i,\omega )\) there is at least one positive \(r_{\omega j}^i\) \((j\in M_{\omega }^i)\), and any two rows of the matrix are linearly independent. It then follows from elementary row operations that the Jacobian matrix \(J\) has full row rank.
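The structure of this argument can be checked numerically on a toy instance: whatever the left block looks like (even if rank-deficient), appending the nonsingular diagonal block \(-t(2-t)\mathbf{I}_m\) forces full row rank for \(0<t<2\), while at \(t=2\) the block vanishes, which is why that case needs the separate treatment below. The matrices here are illustrative, not the paper's Jacobian.

```python
def rank(M, tol=1e-10):
    """Row rank via Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        piv = max(range(r, rows), key=lambda i: abs(M[i][c]))
        if abs(M[piv][c]) < tol:
            continue                      # no pivot in this column
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [M[i][j] - factor * M[r][j] for j in range(cols)]
        r += 1
        if r == rows:
            break
    return r

m, t = 3, 0.7
A = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [0.0, 0.0, 0.0]]  # rank-deficient on purpose
# Append the diagonal block -t(2-t)I: the combined matrix [A | -t(2-t)I]
J = [A[i] + [-t * (2.0 - t) if j == i else 0.0 for j in range(m)] for i in range(m)]
assert rank(J) == m                       # full row rank for 0 < t < 2
```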
Second, we discuss the case \(t=2\), where the Jacobian matrix is denoted by \(J_1\). Specifically,
where
is a nonsingular diagonal matrix, and \(J_{12}=\mathrm {diag}(e_{m_{\omega }^i})\in {\mathbb {R}}^{m\times nd}\), where \(e_{m_{\omega }^i}=(1,\ldots ,1)^{\top }\) is an \(m_{\omega }^i\)-dimensional column vector and \(\sum \limits _{i\in N}\sum \limits _{\omega \in \varOmega }m_{\omega }^i=m\). Hence, \(J_{12}\) has full column rank. We have already shown that \(\dfrac{\partial q^2}{\partial y}\) has full row rank. Therefore, \(J_1\) has full rank. This completes the proof. \(\square \)
B: The Compactness of \(\varPhi \)
In this appendix, we prove the compactness of \(\varPhi \). This result is used in Corollary 3.1. Let \(S_0\) be the set of all \((x,\lambda ,\mu ,t)\) satisfying the system (9). We first show that \(S_0\) is bounded. Because \(x_{\omega j}^i\) is a probability with \(0\le x_{\omega j}^i\le 1\), it is bounded for all \(j\in M_{\omega }^i\), \(\omega \in \varOmega \), and \(i\in N\). Let \(u^{i+}:=(u^{i+}_{\omega })_{\omega \in \varOmega }\) and \(\pi ^+:=(\pi ^+({\bar{\omega }}|\omega ))_{{\bar{\omega }},\omega \in \varOmega }\) with
Rewriting Eq. (7) in a vector form, we get that
Because \((1-(1-\zeta (t))\delta \pi ^+)>0\), we obtain the boundedness of \(\mu \):
Moreover, from the first group of equations of the system (9), the boundedness of \(\lambda _{\omega j}^i\) follows. Let \(X:=\{x\in {\mathbb {R}}^m_+\;:\; 0\le x\le 1\}\) and \(\varDelta :=\{\lambda \in {\mathbb {R}}^m_+\;:\; \lambda _0\le \lambda \le \lambda _1 \}\), where \(\lambda _0\) and \(\lambda _1\) are the lower and upper bounds of \(\lambda \), respectively. Similarly, define \(\varLambda :=\{\mu \in \mathbb R^{nd}\;:\;\mu _0\le \mu \le \mu _1\}\), where \(\mu _0\) and \(\mu _1\) are the lower and upper bounds of \(\mu \), respectively. Hence, \(S_0\) is contained in the nonempty, convex, and compact set \(X\times \varDelta \times \varLambda \times [0,2]\). The compactness of \(\varPhi \) then follows immediately. \(\square \)
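The boundedness of \(\mu \) is in essence the standard Neumann-series bound \(\Vert \mu \Vert _\infty \le \Vert u^+\Vert _\infty /(1-\delta )\) for a discount factor \(\delta \in (0,1)\) and a row-stochastic transition matrix \(\pi \). A minimal numerical illustration, with made-up two-state data and the \(\zeta (t)\) factor dropped for simplicity:

```python
def solve_neumann(pi, u, delta, iters=2000):
    """Fixed-point iteration mu <- u + delta * pi @ mu (Neumann series)."""
    n = len(u)
    mu = [0.0] * n
    for _ in range(iters):
        mu = [u[s] + delta * sum(pi[s][sp] * mu[sp] for sp in range(n))
              for s in range(n)]
    return mu

# Illustrative 2-state data: pi row-stochastic, u the positive payoff bound
pi = [[0.6, 0.4], [0.3, 0.7]]
u = [1.0, 0.5]
delta = 0.9
mu = solve_neumann(pi, u, delta)
bound = max(abs(x) for x in u) / (1.0 - delta)   # a priori bound, = 10.0 here
```

Since the iteration map is a contraction with modulus \(\delta \), the fixed point exists, is unique, and respects the a priori bound, which is exactly what delivers the boundedness of \(\mu \) above.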
Cite this article
Li, P., Dang, C. An Arbitrary Starting Tracing Procedure for Computing Subgame Perfect Equilibria. J Optim Theory Appl 186, 667–687 (2020). https://doi.org/10.1007/s10957-020-01703-z
Keywords
- Noncooperative stochastic games
- Subgame perfect equilibrium
- Linear tracing procedure
- Arbitrary starting