Abstract
The number of residual blocks in a computer Go program that follows the AlphaGo Zero algorithm is one of the key factors in the program's playing strength. In this paper, we propose a method to deepen the residual network without degrading performance. Since self-play tends to be the most time-consuming part of AlphaGo Zero training, we then demonstrate how training can continue on the deepened network using the self-play records generated by the original network, thereby saving time. The deepening is performed by inserting new layers into the original network; we present three insertion schemes based on the idea behind Net2Net. Finally, among the many ways to sample the previously generated self-play records, we propose two methods that allow the deepened network to continue training. In an experiment extending a 20-block network to 40 blocks for \(9 \times 9\) Go, the best-performing extension scheme achieves a 61.69% win rate against the unextended 20-block player while greatly reducing the time spent on self-play.
H.-C. Hsieh and T.-R. Wu contributed equally.
This research is partially supported by the Ministry of Science and Technology (MOST) under Grant Numbers MOST 107-2634-F-009-011 and MOST 108-2634-F-009-011 through Pervasive Artificial Intelligence Research (PAIR) Labs, Taiwan, and is also partially supported by the Industrial Technology Research Institute (ITRI) of Taiwan under Grant Number B5-10804-HQ-01. Computing resources are partially supported by the National Center for High-performance Computing (NCHC).
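The core idea behind a Net2Net-style insertion, as described in the abstract, is that a new residual block can be added without changing the function the network computes: if the block's final layer is zero-initialized, the block reduces to the identity at insertion time. The following is a minimal NumPy sketch of that property; the linear "conv" stand-in, the weight shapes, and the two-block toy network are illustrative assumptions, not the paper's actual architecture or insertion schemes.

```python
import numpy as np

def conv_like(x, w):
    # stand-in for a convolution: a linear map on a flat feature vector
    return w @ x

def residual_block(x, w1, w2):
    # residual block: x + W2 * relu(W1 * x)
    return x + conv_like(np.maximum(conv_like(x, w1), 0.0), w2)

rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal(d)

# "original" network: two residual blocks with random weights
w = [rng.standard_normal((d, d)) * 0.1 for _ in range(4)]

def net(x):
    h = residual_block(x, w[0], w[1])
    return residual_block(h, w[2], w[3])

# Net2Net-style deepening: insert a new block whose second layer is
# zero-initialized, so the block computes the identity at insertion time
w_new1 = rng.standard_normal((d, d)) * 0.1
w_new2 = np.zeros((d, d))

def deeper_net(x):
    h = residual_block(x, w[0], w[1])
    h = residual_block(h, w_new1, w_new2)  # new block, identity at init
    return residual_block(h, w[2], w[3])

# the deepened network reproduces the original's output exactly,
# so training (e.g. on previously generated self-play records) can
# resume from an equivalent starting point
assert np.allclose(net(x), deeper_net(x))
```

Because the deepened network initially matches the original exactly, its early training signal is consistent with the self-play records the original network produced, which is what makes reusing those records plausible.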
References
Chen, T., Goodfellow, I., Shlens, J.: Net2Net: accelerating learning via knowledge transfer. arXiv preprint arXiv:1511.05641 (2015)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167 (2015)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Pascutto, G.C.: Leela-Zero Github repository (2018). https://github.com/gcp/leela-zero
Silver, D., et al.: Mastering the game of Go without human knowledge. Nature 550(7676), 354 (2017)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Tian, Y., et al.: ELF OpenGo: an analysis and open reimplementation of AlphaZero. arXiv preprint arXiv:1902.04522 (2019)
Wu, I.C., Wu, T.R., Liu, A.J., Guei, H., Wei, T.: On strength adjustment for MCTS-based programs. In: Thirty-Third AAAI Conference on Artificial Intelligence (2019)
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Hsieh, HC., Wu, TR., Wei, TH., Wu, IC. (2020). Net2Net Extension for the AlphaGo Zero Algorithm. In: Cazenave, T., van den Herik, J., Saffidine, A., Wu, IC. (eds) Advances in Computer Games. ACG 2019. Lecture Notes in Computer Science(), vol 12516. Springer, Cham. https://doi.org/10.1007/978-3-030-65883-0_11
Print ISBN: 978-3-030-65882-3
Online ISBN: 978-3-030-65883-0