Reducing Hierarchical Deep Learning Networks as Game Playing Artefact Using Regret Matching

Chapter in: Handbook of Deep Learning Applications

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 136)

Abstract

Human behavior prediction in strategic scenarios has recently been addressed through hierarchical deep learning networks. Here we present a mathematical framework for reducing a hierarchical deep learning network to a game-centric object. Considering a simple game, we show the equivalence between the training problem's global minimizers and Nash equilibria. We then extend the game to hierarchical deep learning networks, where the correspondence between Nash equilibria and the network's critical points is addressed. In light of these connections, other learning methods are investigated. The experiments use artificial datasets developed from the RPS game, CT experiments, and Poker variant games, as well as the real MNIST dataset. The experimental evaluation demonstrates the proposed framework's efficiency. It is concluded that regret matching achieves better training performance than other deep learning networks.
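The core procedure the abstract refers to, regret matching (Hart and Mas-Colell's adaptive procedure), can be illustrated on the RPS (rock-paper-scissors) game mentioned above. The sketch below is a minimal self-play implementation for intuition only; it is not the chapter's framework, and all function and variable names are illustrative. Each player plays actions in proportion to positive cumulative regret, and the average strategies converge toward the Nash equilibrium (1/3, 1/3, 1/3) of this zero-sum game.

```python
import random

def regret_matching_rps(iterations=20000, seed=0):
    """Self-play regret matching on rock-paper-scissors.

    Returns each player's *average* strategy, which is what converges
    toward the Nash equilibrium in a two-player zero-sum game.
    """
    rng = random.Random(seed)
    # payoff[a][b]: payoff to the player choosing a against an opponent
    # choosing b (0 = rock, 1 = paper, 2 = scissors). The matrix is
    # antisymmetric, so the same table serves both players.
    payoff = [[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]]
    regrets = [[0.0] * 3 for _ in range(2)]
    strategy_sums = [[0.0] * 3 for _ in range(2)]

    def current_strategy(reg):
        # Regret matching: play in proportion to positive cumulative regret;
        # fall back to uniform when no action has positive regret.
        pos = [max(r, 0.0) for r in reg]
        total = sum(pos)
        return [p / total for p in pos] if total > 0 else [1.0 / 3] * 3

    def sample(probs):
        r, acc = rng.random(), 0.0
        for a, p in enumerate(probs):
            acc += p
            if r < acc:
                return a
        return len(probs) - 1  # guard against floating-point round-off

    for _ in range(iterations):
        strats = [current_strategy(regrets[p]) for p in range(2)]
        actions = [sample(strats[p]) for p in range(2)]
        for p in range(2):
            strategy_sums[p] = [s + x for s, x in zip(strategy_sums[p], strats[p])]
            opp = actions[1 - p]
            # Regret of each alternative action versus the action taken,
            # measured against the opponent's realized action.
            utils = [payoff[a][opp] for a in range(3)]
            for a in range(3):
                regrets[p][a] += utils[a] - utils[actions[p]]

    return [[s / iterations for s in strategy_sums[p]] for p in range(2)]
```

Running `regret_matching_rps()` yields two probability vectors, each close to uniform; the chapter's contribution is to connect this style of no-regret learning to the critical points of hierarchical deep networks, which this toy sketch does not attempt.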

Author information

Correspondence to Arindam Chaudhuri.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Chaudhuri, A., Ghosh, S.K. (2019). Reducing Hierarchical Deep Learning Networks as Game Playing Artefact Using Regret Matching. In: Balas, V., Roy, S., Sharma, D., Samui, P. (eds) Handbook of Deep Learning Applications. Smart Innovation, Systems and Technologies, vol 136. Springer, Cham. https://doi.org/10.1007/978-3-030-11479-4_16
