Abstract
Among the various search strategies for neural architecture search (NAS), evolutionary algorithms (EAs) have attracted much attention due to their global optimization capability. However, the large number of performance evaluations they require makes EA-based methods extremely time-consuming. To address this issue, this paper proposes an efficient framework called HENA (Hierarchical Evolution of Neural Architecture). In HENA, NAS is hierarchically divided into two consecutive phases: candidate operation search and connection relationship search. To this end, a supernet is defined to subsume the whole search space, so that every architecture can inherit its weights. A novel evolutionary algorithm, the state transition algorithm (STA), is used to traverse the search space continuously. In the first phase, sampled architectures are trained on mini-group data to search for better operations at different positions within the network. In the second phase, better connection relationships among operations are determined directly, without further training, by inheriting the trained weights. Finally, the proposed method is evaluated on widely used datasets; the experimental results show that (1) the architecture learned by HENA achieves state-of-the-art performance (2.93% and 20.44% error rates on CIFAR-10 and CIFAR-100, respectively), and (2) the learned architecture achieves a 24.8% top-1 error rate on ImageNet.
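To make the two-phase procedure concrete, below is a minimal, self-contained Python sketch of a supernet with shared weights, an operation-search phase that trains sampled candidates, and a connection-search phase that reuses the inherited weights without further training. Everything here is an illustrative assumption rather than the authors' implementation: the toy fitness stands in for validation accuracy, `train_step` stands in for gradient updates on mini-group data, and the single random-swap transformation stands in for STA's actual operators.

```python
# Hypothetical sketch of a HENA-style two-phase search loop.
# All names (Supernet, sta_transform, fitness, train_step) are
# illustrative assumptions, not the authors' code.
import random

OPS = ["conv3x3", "conv5x5", "sep_conv", "max_pool", "identity"]
NUM_POSITIONS = 6  # number of operation slots in the toy supernet


class Supernet:
    """Holds one shared weight per (position, op) pair; every sampled
    architecture reads and updates these shared weights."""

    def __init__(self):
        self.weights = {(p, op): random.random()
                        for p in range(NUM_POSITIONS) for op in OPS}

    def fitness(self, arch, connections=None):
        # Stand-in for validation accuracy: sum the inherited weights,
        # optionally discounted by the number of extra connections.
        score = sum(self.weights[(p, op)] for p, op in enumerate(arch))
        if connections is not None:
            score -= 0.05 * sum(connections)
        return score

    def train_step(self, arch):
        # Stand-in for training on mini-group data: nudge the shared
        # weights of the operations the sampled architecture uses.
        for p, op in enumerate(arch):
            self.weights[(p, op)] += 0.1 * random.random()


def sta_transform(arch):
    """One STA-style discrete transformation: replace the operation at a
    random position (a crude substitute for the real STA operators)."""
    child = list(arch)
    child[random.randrange(NUM_POSITIONS)] = random.choice(OPS)
    return child


supernet = Supernet()
best = [random.choice(OPS) for _ in range(NUM_POSITIONS)]

# Phase 1: evolve candidate operations; each sampled candidate is trained.
for _ in range(50):
    cand = sta_transform(best)
    supernet.train_step(cand)
    if supernet.fitness(cand) > supernet.fitness(best):
        best = cand

# Phase 2: search connection patterns using inherited weights, no training.
best_conn = max(
    ([random.randint(0, 1) for _ in range(NUM_POSITIONS - 1)]
     for _ in range(50)),
    key=lambda c: supernet.fitness(best, c),
)
print("ops:", best, "connections:", best_conn)
```

The point of the sketch is only the control flow that gives HENA its efficiency: phase 1 pays for training each sampled architecture, while phase 2 scores connection patterns for free by reusing the weights already trained into the supernet.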








Data availability
Some or all data, models, or code generated or used during the study are available from the corresponding author upon request.
Acknowledgements
The work presented in this paper was supported by the National Natural Science Foundation of China (Grant Nos. 62273357 and 61860206014), the National Key Research and Development Program of China (Grant No. 2022YFC2904502), and the Hunan Provincial Natural Science Foundation of China (Grant No. 2021JJ20082).
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Cite this article
Du, Y., Zhou, X., Huang, T. et al. A hierarchical evolution of neural architecture search method based on state transition algorithm. Int. J. Mach. Learn. & Cyber. 14, 2723–2738 (2023). https://doi.org/10.1007/s13042-023-01794-w