Partial-Observation Stochastic Games: How to Win when Belief Fails

Published: 02 May 2014

Abstract

In two-player finite-state stochastic games of partial observation on graphs, in every state of the graph, the players simultaneously choose an action, and their joint actions determine a probability distribution over the successor states. The game is played for infinitely many rounds, and thus the players construct an infinite path in the graph. We consider reachability objectives where the first player tries to ensure that a target state is visited almost-surely (i.e., with probability 1) or positively (i.e., with positive probability), regardless of the strategy of the second player.
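To make the round structure concrete, the following is a minimal Python sketch of such a game, not the paper's formal definition. The names `step`, `play`, `delta`, `obs1`, and `obs2` are illustrative assumptions: `delta` maps a state and a joint action to a distribution over successors, and each strategy is a function of the sequence of observations that player has seen so far.

```python
import random
from typing import Callable, Dict, List, Tuple

State = str
Action = str
Obs = str
Dist = Dict[State, float]
# Hypothetical encoding: (state, action of player 1, action of player 2) -> distribution.
Delta = Dict[Tuple[State, Action, Action], Dist]
Strategy = Callable[[List[Obs]], Action]


def step(delta: Delta, state: State, a1: Action, a2: Action,
         rng: random.Random) -> State:
    """Sample one successor from the distribution fixed by the joint action."""
    dist = delta[(state, a1, a2)]
    successors = list(dist)
    weights = [dist[s] for s in successors]
    return rng.choices(successors, weights=weights, k=1)[0]


def play(delta: Delta, obs1: Dict[State, Obs], obs2: Dict[State, Obs],
         start: State, target: State,
         strategy1: Strategy, strategy2: Strategy,
         rounds: int, rng: random.Random) -> List[State]:
    """Play a finite prefix of the infinite game, stopping once the target is visited."""
    path = [start]
    for _ in range(rounds):
        if path[-1] == target:
            break
        # Each player chooses from its own observation history only.
        a1 = strategy1([obs1[s] for s in path])
        a2 = strategy2([obs2[s] for s in path])
        path.append(step(delta, path[-1], a1, a2, rng))
    return path
```

A finite simulation like this can only illustrate a play. Whether player 1 wins almost-surely or positively is a property of all plays against every strategy of player 2, which is what the decision procedures summarized in this abstract address.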
We classify such games according to the information and the power of randomization available to the players. On the basis of information, the game can be one-sided with either (a) player 1 or (b) player 2 having partial observation (while the other player has perfect observation), or two-sided with (c) both players having partial observation. On the basis of randomization, (a) the players may not be allowed to use randomization (pure strategies), or (b) they may choose a probability distribution over actions but the actual random choice is external and not visible to the player (actions invisible), or (c) they may use full randomization.
Our main results for pure strategies are as follows:
(1) For one-sided games with player 2 having perfect observation, we show that (in contrast to fully randomized strategies) belief-based (subset-construction-based) strategies are not sufficient, and we present an exponential upper bound on memory for both almost-sure and positive winning strategies; we show that the problem of deciding the existence of almost-sure and positive winning strategies for player 1 is EXPTIME-complete, and we present symbolic algorithms that avoid the explicit exponential construction.
(2) For one-sided games with player 1 having perfect observation, we show that nonelementary memory is both necessary and sufficient for both almost-sure and positive winning strategies.
(3) We show that for the general (two-sided) case, finite-memory strategies are sufficient for both positive and almost-sure winning, and at least nonelementary memory is required.
We establish the equivalence of the almost-sure winning problems for pure strategies and for randomized strategies with actions invisible. Our equivalence result exhibits serious flaws in previous results in the literature: we show a nonelementary memory lower bound for almost-sure winning, whereas an exponential upper bound was previously claimed.
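For readers unfamiliar with the term, a belief is the set of states that player 1 considers possible given its observations so far, and the subset construction updates it round by round. The sketch below shows one such update in its standard textbook form, which is an assumption here and not necessarily the exact definition used in the paper. The names `belief_update`, `delta`, `obs1`, and `actions2` are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

State = str
Action = str
Obs = str
Dist = Dict[State, float]
# Hypothetical encoding: (state, action of player 1, action of player 2) -> distribution.
Delta = Dict[Tuple[State, Action, Action], Dist]


def belief_update(delta: Delta, obs1: Dict[State, Obs],
                  actions2: Iterable[Action],
                  belief: FrozenSet[State], a1: Action,
                  observation: Obs) -> FrozenSet[State]:
    """Return the states player 1 considers possible after playing a1 and seeing `observation`.

    A state t is kept if some state in the current belief can reach it with
    positive probability under a1 and some action of player 2, and t is
    consistent with the new observation.
    """
    actions2 = list(actions2)
    successors = set()
    for s in belief:
        for a2 in actions2:
            for t, p in delta.get((s, a1, a2), {}).items():
                if p > 0 and obs1[t] == observation:
                    successors.add(t)
    return frozenset(successors)
```

A belief-based strategy chooses its action as a function of the current belief alone. Result (1) states that, for pure strategies, this amount of memory is not sufficient in general.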

      Published In

      ACM Transactions on Computational Logic  Volume 15, Issue 2
      April 2014
      257 pages
      ISSN:1529-3785
      EISSN:1557-945X
      DOI:10.1145/2616911

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 02 May 2014
      Accepted: 01 November 2013
      Received: 01 July 2013
      Published in TOCL Volume 15, Issue 2

      Author Tags

      1. Partial-observation games
      2. memory bounds
      3. positive and almost-sure winning
      4. reachability and Büchi objectives
      5. stochastic games
      6. strategy complexity

      Qualifiers

      • Research-article
      • Research
      • Refereed
