A novel deep residual network-based incomplete information competition strategy for four-players Mahjong games

Wang, Mingyan; Yan, Tianwei; Luo, Mingyuan; Huang, Wei

doi:10.1007/s11042-019-7682-5

Published: 04 May 2019

Volume 78, pages 23443–23467, (2019)

Multimedia Tools and Applications

Abstract

Game theory has benefited considerably from recent advances in deep learning, and intelligent competition strategies have been proposed in recent years for both complete information games and incomplete information games. This paper focuses on four-player Chinese Mahjong, a typical incomplete information game. A low-level semantic pseudo image, generated from game-related prior knowledge, and a novel deep residual network-based competition strategy are introduced to play the game. Technically, the deep learning in the new strategy is realized by a series of “GoBlock” units, a new deep learning model structure introduced in this paper. Each “GoBlock” is in turn made up of several “Inception+” sub-structures, which are also novel. Comprehensive experiments are conducted to evaluate the new strategy. A large amount of Chinese Mahjong game data was collected from an online Chinese Mahjong company to construct the dataset, and the proposed strategy is compared with several shallow learning-based and deep learning-based methods. Qualitative and quantitative analyses of the outcomes of all compared methods suggest that the new strategy is superior to the others. Furthermore, a competition between the new AI competition strategy and three real senior players is presented. Quantitative analysis based on four measures reveals, from a statistical point of view, the effectiveness and efficiency of the new strategy relative to the senior human players. Notably, this work is the first attempt to tackle Mahjong, a typical incomplete information game, from a deep learning perspective.

Acknowledgements

This study was supported by grant 61862043 from the National Natural Science Foundation of China, and by key grants 20181ACB20006 and 20171ACB21017 and grant 20161BAB212047 from the Natural Science Foundation of Jiangxi Province.

Author information

Authors and Affiliations

Department of Computer Science, School of Information Engineering, Nanchang University, Nanchang, China
Mingyan Wang, Tianwei Yan, Mingyuan Luo & Wei Huang

Corresponding author

Correspondence to Wei Huang.

About this article

Cite this article

Wang, M., Yan, T., Luo, M. et al. A novel deep residual network-based incomplete information competition strategy for four-players Mahjong games. Multimed Tools Appl 78, 23443–23467 (2019). https://doi.org/10.1007/s11042-019-7682-5

Received: 22 August 2018
Revised: 28 February 2019
Accepted: 24 April 2019
Published: 04 May 2019
Issue Date: 30 August 2019
DOI: https://doi.org/10.1007/s11042-019-7682-5
