Abstract
The number of residual blocks in a computer Go program that follows the AlphaGo Zero algorithm is one of the key factors in the program's playing strength. In this paper, we propose a method to deepen the residual network without degrading performance. Since self-play tends to be the most time-consuming part of AlphaGo Zero training, we then demonstrate how training can continue on the deepened network using the self-play records generated by the original network, thereby saving time. The deepening is performed by inserting new layers into the original network; we present three insertion schemes based on the idea behind Net2Net. Finally, among the many ways to sample the previously generated self-play records, we propose two methods that allow the deepened network to continue training. In an experiment extending a 20-block network to 40 blocks for \(9 \times 9\) Go, the best-performing extension scheme achieves a 61.69% win rate against the unextended 20-block player while greatly reducing the time spent on self-play.
H.-C. Hsieh and T.-R. Wu contributed equally.
This research is partially supported by the Ministry of Science and Technology (MOST) under Grant Numbers MOST 107-2634-F-009-011 and MOST 108-2634-F-009-011 through Pervasive Artificial Intelligence Research (PAIR) Labs, Taiwan, and is also partially supported by the Industrial Technology Research Institute (ITRI) of Taiwan under Grant Number B5-10804-HQ-01. Computing resources are partially supported by the National Center for High-performance Computing (NCHC).
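The core idea behind a Net2Net-style insertion, as described in the abstract, is that a new residual block can be added without changing the function the network computes: if the block's final layer is zero-initialized, the block reduces to the identity at insertion time. The following is a minimal NumPy sketch of that property; the linear "conv" stand-in, the weight shapes, and the two-block toy network are illustrative assumptions, not the paper's actual architecture or insertion schemes.

```python
import numpy as np

def conv_like(x, w):
    # stand-in for a convolution: a linear map on a flat feature vector
    return w @ x

def residual_block(x, w1, w2):
    # residual block: x + W2 * relu(W1 * x)
    return x + conv_like(np.maximum(conv_like(x, w1), 0.0), w2)

rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal(d)

# "original" network: two residual blocks with random weights
w = [rng.standard_normal((d, d)) * 0.1 for _ in range(4)]

def net(x):
    h = residual_block(x, w[0], w[1])
    return residual_block(h, w[2], w[3])

# Net2Net-style deepening: insert a new block whose second layer is
# zero-initialized, so the block computes the identity at insertion time
w_new1 = rng.standard_normal((d, d)) * 0.1
w_new2 = np.zeros((d, d))

def deeper_net(x):
    h = residual_block(x, w[0], w[1])
    h = residual_block(h, w_new1, w_new2)  # new block, identity at init
    return residual_block(h, w[2], w[3])

# the deepened network reproduces the original's output exactly,
# so training (e.g. on previously generated self-play records) can
# resume from an equivalent starting point
assert np.allclose(net(x), deeper_net(x))
```

Because the deepened network initially matches the original exactly, its early training signal is consistent with the self-play records the original network produced, which is what makes reusing those records plausible.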
References
Chen, T., Goodfellow, I., Shlens, J.: Net2Net: accelerating learning via knowledge transfer. arXiv preprint arXiv:1511.05641 (2015)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167 (2015)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Pascutto, G.C.: Leela-Zero Github repository (2018). https://github.com/gcp/leela-zero
Silver, D., et al.: Mastering the game of Go without human knowledge. Nature 550(7676), 354 (2017)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Tian, Y., et al.: ELF OpenGo: an analysis and open reimplementation of AlphaZero. arXiv preprint arXiv:1902.04522 (2019)
Wu, I.C., Wu, T.R., Liu, A.J., Guei, H., Wei, T.: On strength adjustment for MCTS-based programs. In: Thirty-Third AAAI Conference on Artificial Intelligence (2019)
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Hsieh, HC., Wu, TR., Wei, TH., Wu, IC. (2020). Net2Net Extension for the AlphaGo Zero Algorithm. In: Cazenave, T., van den Herik, J., Saffidine, A., Wu, IC. (eds) Advances in Computer Games. ACG 2019. Lecture Notes in Computer Science(), vol 12516. Springer, Cham. https://doi.org/10.1007/978-3-030-65883-0_11
Print ISBN: 978-3-030-65882-3
Online ISBN: 978-3-030-65883-0