Abstract
We consider Gillette’s two-person zero-sum stochastic games with perfect information. For each \(k \in \mathbb {N}=\{0,1,\ldots \}\) we introduce an effective reward function, called \(k\)-total. For \(k = 0\) and \(k = 1\) this function is known as the mean payoff and the total reward, respectively. We restrict our attention to the deterministic case. For all \(k\), we prove the existence of a saddle point which can be realized by uniformly optimal pure stationary strategies. We also demonstrate that \(k\)-total reward games can be embedded into \((k+1)\)-total reward games.
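The abstract does not spell out the definition of the effective reward, so the following is only a rough sketch under an assumed reading: the sequence of local rewards is replaced by its partial sums k times, and the result is averaged over time. The function name `k_total_finite` is hypothetical, and a finite-horizon average is only a proxy for the limit taken over an infinite play.

```python
from itertools import accumulate


def k_total_finite(rewards, k):
    """Finite-horizon proxy for a k-total effective reward.

    Assumption (not stated in the abstract): the k-total reward is the
    time average of the k-fold iterated partial sums of the local rewards.
    With k = 0 this is the ordinary mean payoff; with k = 1 it is the
    average of the running totals.
    """
    seq = list(rewards)
    if not seq:
        raise ValueError("rewards must be non-empty")
    for _ in range(k):
        # Replace the sequence by its sequence of partial sums.
        seq = list(accumulate(seq))
    return sum(seq) / len(seq)


# A play that alternates local rewards +1 and -1.
play = [1, -1] * 500
print(k_total_finite(play, 0))  # 0.0 (mean payoff)
print(k_total_finite(play, 1))  # 0.5 (average of running totals 1, 0, 1, 0, ...)
```

On this alternating play the mean payoff vanishes while the next level is finite and nonzero, which is the kind of situation where a higher level can distinguish plays that the lower level cannot.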
Notes
Following standard terminology, we use vertices and arcs when we talk about graphs, and positions and moves when we talk about games.
A history-dependent strategy is called Markovian if the move depends only on the current time and the current position (but not on the complete history).
That is, for every \(k\)-total reward game we can construct an equivalent \((k+1)\)-total reward game, i.e., solving the latter provides a solution to the former.
That is, the running time is bounded by a polynomial in \(n\) and \(R\).
References
Björklund H, Vorobyov S (2005) Combinatorial structure and randomized subexponential algorithms for infinite games. Theor Comp Sci 349(3):347–360
Blackwell D (1962) Discrete dynamic programming. Ann Math Statist 33:719–726
Boros E, Elbassioni K, Gurvich V, Makino K (2012) On Nash equilibria and improvement cycles in pure positional strategies for chess-like and backgammon-like \(n\)-person games. Discret Math 312(4):772–788
Boros E, Elbassioni K, Gurvich V, Makino K (2013) On canonical forms for zero-sum stochastic mean payoff games. Dynamic Games Appl
Boros E, Elbassioni K, Gurvich V, Makino K (2015) Markov decision processes and stochastic games with total effective payoff. In: Mayr EW, Ollinger N (eds) STACS, volume 30 of LIPIcs, pp 103–115. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. The full version is available as Technical report RRR-4-2014, RUTCOR, Rutgers University
Ehrenfeucht A, Mycielski J (1979) Positional strategies for mean payoff games. Int J Game Theory 8:109–113
Filar J, Vrieze K (1996) Competitive Markov decision processes. Springer, New York
Fulkerson DR, Harding GC (1977) Maximizing the minimum source-sink path subject to a budget constraint. Math Program 13:116–118
Gillette D (1957) Stochastic games with zero stop probabilities. In: Contributions to the Theory of Games, Vol. III, volume 39 of Annals of Mathematics Studies. Princeton, pp 179–187
Gimbert H, Zielonka W (2004) When can you play positionally? In: Mathematical Foundations of Computer Science 2004, Lecture Notes in Computer Science, vol 3153. Springer, Berlin Heidelberg, pp 686–697
Gould HW (2012) Combinatorial identities, http://www.math.wvu.edu/~gould/
Gurvich V (1988) A stochastic game with complete information and without equilibrium situations in pure stationary strategies. Russian Math Surv 43(2):171–172
Gurvich VA, Karzanov AV, Khachiyan LG (1988) Cyclic games and an algorithm to find minimax cycle means in directed graphs. USSR Comput Math Math Phys 28:85–91
Halman N (2007) Simple stochastic games, parity games, mean payoff games and discounted payoff games are all LP-type problems. Algorithmica 49(1):37–50
Hardy GH, Littlewood JE (1931) Notes on the theory of series (XVI): two Tauberian theorems. J London Math Soc 6:281–286
Israeli E, Wood RK (2002) Shortest-path network interdiction. Networks 40(2):97–111
Karzanov AV, Lebedev VN (1993) Cyclical games with prohibition. Math Program 60:277–293
Khachiyan L, Boros E, Borys K, Elbassioni K, Gurvich V, Rudolf G, Zhao J (2008) On short paths interdiction problems: Total and node-wise limited interdiction. Theory Comput Syst 43(2):204–233
Khachiyan L, Gurvich V, Zhao J (2006) Extending Dijkstra’s algorithm to maximize the shortest path by node-wise limited arc interdiction. In: CSR, pp 221–234
Liggett TM, Lippman SA (1969) Stochastic games with perfect information and time average payoff. SIAM Rev 11(4):604–607
Mertens JF, Neyman A (1981) Stochastic games. Int J Game Theory 10:53–66
Mine H, Osaki S (1970) Markovian decision process. American Elsevier Publishing Co., New York
Möhring RH, Skutella M, Stork F (2004) Scheduling with and/or precedence constraints. SIAM J Comput 33(2):393–415
Moulin H (1976a) Extension of two person zero sum games. J Math Anal Appl 55(2):490–507
Moulin H (1976b) Prolongement des jeux à deux joueurs de somme nulle. une théorie abstraite des duels. Mémoires de la Soc Math France 45:5–111
Neyman A, Sorin S (2003) Stochastic Games and Applications. Kluwer Academic Publishers, NATO ASI series edition
Pisaruk NN (1999) Mean cost cyclical games. Math Operat Res 24(4):817–828
Puterman ML (2005) Markov decision processes: discrete stochastic dynamic programming. Wiley-Interscience, USA
Shapley L (1953) Stochastic games. Proc Nat Acad Sci USA 39:1095–1100
Thuijsman F, Vrieze OJ (1987) The bad match, a total reward stochastic game. Oper Res Spektrum 9:93–99
Thuijsman F, Vrieze OJ (1998) Total reward stochastic games and sensitive average reward strategies. J Optim Theory Appl 98:175–196
Vorobyov S (2008) Cyclic games and linear programming. Discret Appl Math 156(11):2195–2231
Zwick U, Paterson M (1996) The complexity of mean payoff games on graphs. Theor Comp Sci 158(1–2):343–359
Acknowledgments
We thank the two anonymous reviewers for the careful reading and many helpful remarks. Part of this research was done at the Mathematisches Forschungsinstitut Oberwolfach during a stay within the Research in Pairs Program from July 26 to August 15, 2015. This research was partially supported by the Scientific Grant-in-Aid from Ministry of Education, Science, Sports and Culture of Japan. The first author also thanks the National Science Foundation (Grant IIS-1161476).
Cite this article
Boros, E., Elbassioni, K., Gurvich, V. et al. A nested family of \(k\)-total effective rewards for positional games. Int J Game Theory 46, 263–293 (2017). https://doi.org/10.1007/s00182-016-0532-z
DOI: https://doi.org/10.1007/s00182-016-0532-z