Abstract
Prediction of human behavior in strategic scenarios has recently been addressed with hierarchical deep learning networks. Here we present a mathematical framework for reducing a hierarchical deep learning network to a game-playing artefact. For a simple game, we show the equivalence between the global minimizers of the training problem and the Nash equilibria of the game. We then extend this game formulation to hierarchical deep learning networks and establish the correspondence between Nash equilibria and the network's critical points. In light of these connections, alternative learning methods are investigated. Experiments are performed on artificial datasets derived from the rock-paper-scissors (RPS) game, CT experiments, and poker variant games, as well as on the real MNIST dataset. The experimental evaluation demonstrates the efficiency of the proposed framework; in particular, regret matching achieves better training performance than the other deep-learning training approaches considered.
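The training procedure named in the title is regret matching. As a minimal, purely illustrative sketch (not the chapter's implementation, which lifts the idea to hierarchical network training), the following Python example runs regret matching (Hart and Mas-Colell, 2000) in self-play on rock-paper-scissors; the players' average strategies converge to the uniform Nash equilibrium, the kind of equilibrium correspondence the abstract refers to. All names and parameters here are chosen for illustration only.

import numpy as np

ACTIONS = 3  # rock, paper, scissors
# Payoff matrix for player 1: entry [a1, a2] is +1 for a win, 0 for a draw, -1 for a loss.
PAYOFF = np.array([[0, -1, 1],
                   [1, 0, -1],
                   [-1, 1, 0]])

def strategy_from_regret(cum_regret):
    # Mix proportionally to positive cumulative regret; play uniformly if none is positive.
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    return positive / total if total > 0 else np.full(ACTIONS, 1.0 / ACTIONS)

def train(iterations=20000, seed=0):
    rng = np.random.default_rng(seed)
    cum_regret = np.zeros((2, ACTIONS))
    cum_strategy = np.zeros((2, ACTIONS))
    for _ in range(iterations):
        strat = np.array([strategy_from_regret(cum_regret[p]) for p in range(2)])
        cum_strategy += strat
        a = [rng.choice(ACTIONS, p=strat[p]) for p in range(2)]
        # Regret of each alternative action relative to the action actually played.
        u1 = PAYOFF[:, a[1]]      # player 1's payoff for each of its actions
        u2 = -PAYOFF[a[0], :]     # zero-sum game: player 2's payoff per action
        cum_regret[0] += u1 - u1[a[0]]
        cum_regret[1] += u2 - u2[a[1]]
    # Average strategy per player; both rows approach [1/3, 1/3, 1/3].
    return cum_strategy / cum_strategy.sum(axis=1, keepdims=True)

if __name__ == "__main__":
    print(train())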
Cite this chapter
Chaudhuri, A., Ghosh, S.K. (2019). Reducing Hierarchical Deep Learning Networks as Game Playing Artefact Using Regret Matching. In: Balas, V., Roy, S., Sharma, D., Samui, P. (eds) Handbook of Deep Learning Applications. Smart Innovation, Systems and Technologies, vol 136. Springer, Cham. https://doi.org/10.1007/978-3-030-11479-4_16
Print ISBN: 978-3-030-11478-7
Online ISBN: 978-3-030-11479-4
eBook Packages: Intelligent Technologies and Robotics (R0)