Abstract
A recently introduced training-free neural architecture search (TE-NAS) framework suggests that candidate network architectures can be ranked via a combined metric of expressivity and trainability. Expressivity is measured by the number of linear regions into which a network divides the input space. Trainability is assessed via the condition number of the neural tangent kernel (NTK), which governs the convergence rate of training a network with gradient descent. Both measurements have been found to correlate with network test accuracy, so high-performing architectures can be searched for without incurring the intensive cost of network training required by a typical NAS run. In this paper, we propose incorporating TE-NAS into a multi-objective evolutionary algorithm (MOEA), in which expressivity and trainability are kept as two separate objectives rather than being combined. We further add the minimization of floating-point operations (FLOPs) as a third objective to be optimized simultaneously. On the NAS-Bench-101 and NAS-Bench-201 benchmarks, our approach efficiently finds Pareto fronts of a wide range of architectures exhibiting optimal trade-offs among network expressivity, trainability, and complexity. Architectures obtained by our approach on CIFAR-10 also transfer well to CIFAR-100 and ImageNet.
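To make the two proxies and the multi-objective comparison concrete, the following is a minimal sketch, not the authors' implementation: it assumes a small ReLU MLP as a stand-in for a candidate architecture and uses PyTorch. The helper names (`ntk_condition_number`, `num_activation_patterns`, `dominates`) are hypothetical. The NTK condition number serves as the trainability proxy, the count of distinct ReLU activation patterns over random inputs approximates the number of linear regions (expressivity), and a standard Pareto-dominance check compares candidates on (NTK condition number, negated region count, FLOPs), all to be minimized.

```python
# Minimal sketch (assumed setup, not the authors' code) of the training-free
# proxies described in the abstract, evaluated on a small ReLU MLP.
import torch
import torch.nn as nn


def ntk_condition_number(net, x):
    """Condition number of the empirical NTK  Theta = J J^T, where J stacks
    per-sample gradients of the scalar network output w.r.t. all weights."""
    grads = []
    for i in range(x.size(0)):
        out = net(x[i:i + 1]).sum()
        g = torch.autograd.grad(out, net.parameters())
        grads.append(torch.cat([p.reshape(-1) for p in g]))
    jac = torch.stack(grads)                 # (N, num_params)
    ntk = jac @ jac.t()                      # (N, N) empirical NTK
    eig = torch.linalg.eigvalsh(ntk)         # ascending eigenvalues
    return (eig[-1] / eig[0].clamp_min(1e-12)).item()


def num_activation_patterns(net, x):
    """Expressivity proxy: number of distinct ReLU activation patterns
    produced by a batch of inputs (a surrogate for linear-region count)."""
    acts, patterns = [], set()
    hooks = [m.register_forward_hook(lambda m, i, o: acts.append((o > 0).flatten()))
             for m in net.modules() if isinstance(m, nn.ReLU)]
    for i in range(x.size(0)):
        acts.clear()
        net(x[i:i + 1])
        patterns.add(tuple(torch.cat(acts).tolist()))
    for h in hooks:
        h.remove()
    return len(patterns)


def dominates(a, b):
    """Pareto dominance for objective tuples
    (NTK cond. number, -num. linear regions, FLOPs); all minimized."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


if __name__ == "__main__":
    torch.manual_seed(0)
    net = nn.Sequential(nn.Linear(8, 32), nn.ReLU(),
                        nn.Linear(32, 32), nn.ReLU(),
                        nn.Linear(32, 1))
    x = torch.randn(16, 8)
    print("NTK condition number:", ntk_condition_number(net, x))
    print("activation patterns :", num_activation_patterns(net, x))
```

In an MOEA such as NSGA-II, these objective tuples would drive non-dominated sorting of the population instead of being collapsed into a single score, which is the separation the paper advocates.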
Acknowledgements
This research is funded by Vietnam National University HoChiMinh City (VNU-HCM) under grant number DSC2021-26-06.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Do, T., Luong, N.H. (2021). Training-Free Multi-objective Evolutionary Neural Architecture Search via Neural Tangent Kernel and Number of Linear Regions. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science(), vol 13109. Springer, Cham. https://doi.org/10.1007/978-3-030-92270-2_29
DOI: https://doi.org/10.1007/978-3-030-92270-2_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92269-6
Online ISBN: 978-3-030-92270-2