Approximating Optimal Dudo Play with Fixed-Strategy Iteration Counterfactual Regret Minimization

Neller, Todd W.; Hnath, Steven

doi:10.1007/978-3-642-31866-5_15


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7168)

Abstract

Using the bluffing dice game Dudo as a challenge domain, we abstract information sets by an imperfect recall of actions. Even with such abstraction, the standard Counterfactual Regret Minimization (CFR) algorithm proves impractical for Dudo, since the number of recursive visits to the same abstracted information sets increases exponentially with the depth of the game graph. By holding strategies fixed across each training iteration, we show how CFR training iterations may be transformed from an exponential-time recursive algorithm into a polynomial-time dynamic-programming algorithm, making computation of an approximate Nash equilibrium for the full 2-player game of Dudo possible for the first time.
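The contrast between recursive and dynamic-programming traversal can be illustrated on a toy graph. The sketch below is not the authors' algorithm and does not model Dudo's rules. It assumes a hypothetical game graph in which nodes are numbered 0 to `num_claims`, any higher-numbered node may follow a lower one, and a fixed uniform strategy is played at every node. The function names `reach_by_recursion` and `reach_by_dp` are made up for this example. Both functions compute the probability of reaching each node; the first walks every path, the second visits each node once in topological order.

```python
def reach_by_recursion(num_claims):
    """Walk every path from node 0, adding path probability to each node reached.

    Returns (reach, visits). Toy assumption: from node i, play moves uniformly
    at random to any node j with i < j <= num_claims.
    """
    reach = [0.0] * (num_claims + 1)
    visits = 0

    def walk(node, prob):
        nonlocal visits
        visits += 1
        reach[node] += prob
        successors = range(node + 1, num_claims + 1)
        for nxt in successors:
            walk(nxt, prob / len(successors))

    walk(0, 1.0)
    return reach, visits


def reach_by_dp(num_claims):
    """Compute the same reach probabilities with one visit per node.

    Because the strategy is held fixed, each node's total reach probability is
    final by the time the loop arrives at it, so it can be pushed to all
    successors in a single step.
    """
    reach = [0.0] * (num_claims + 1)
    reach[0] = 1.0
    visits = 0
    for node in range(num_claims + 1):  # node order is already topological
        visits += 1
        successors = range(node + 1, num_claims + 1)
        for nxt in successors:
            reach[nxt] += reach[node] / len(successors)
    return reach, visits


if __name__ == "__main__":
    for n in (5, 10, 15):
        _, recursive_visits = reach_by_recursion(n)
        _, dp_visits = reach_by_dp(n)
        print(n, recursive_visits, dp_visits)
```

In this toy graph the recursive walk makes 2^n node visits for n claims, while the single pass makes n + 1 visits and follows n(n + 1)/2 edges, and both return the same reach probabilities up to floating-point rounding.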



Author information

Authors and Affiliations

Dept. of Computer Science, Gettysburg College, Gettysburg, Pennsylvania, 17325, USA
Todd W. Neller & Steven Hnath


Editor information

Editors and Affiliations

Tilburg Institute of Cognition and Communication, Tilburg University, Warandelaan 2, 5037 AB, Tilburg, The Netherlands
H. Jaap van den Herik & Aske Plaat

About this paper

Cite this paper

Neller, T.W., Hnath, S. (2012). Approximating Optimal Dudo Play with Fixed-Strategy Iteration Counterfactual Regret Minimization. In: van den Herik, H.J., Plaat, A. (eds) Advances in Computer Games. ACG 2011. Lecture Notes in Computer Science, vol 7168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31866-5_15

DOI: https://doi.org/10.1007/978-3-642-31866-5_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31865-8
Online ISBN: 978-3-642-31866-5
eBook Packages: Computer Science, Computer Science (R0)
