
A hierarchical evolution of neural architecture search method based on state transition algorithm

Original Article · International Journal of Machine Learning and Cybernetics

Abstract

Among the search strategies used to realize neural architecture search (NAS), evolutionary algorithms (EAs) have attracted much attention due to their global optimization capability. However, the large number of performance evaluations makes EA-based methods extremely time-consuming. To address this issue, this paper proposes an efficient framework called HENA (Hierarchical Evolution of Neural Architecture). In HENA, NAS is divided hierarchically into two consecutive phases: candidate operation search and connection relationship search. For this purpose, a supernet is defined to subsume the whole search space, from which every candidate architecture can inherit weights. A novel evolutionary algorithm, the state transition algorithm (STA), is used to traverse the search space continuously. In the first phase, sampled architectures are trained on mini-group data to search for better operations at different positions within the network. In the second phase, better connection relationships among operations are determined directly, without training, by inheriting the trained weights. Finally, the proposed method is evaluated on widely used datasets, and the experimental results show that (1) the architecture learned by HENA obtains state-of-the-art performance (2.93% and 20.44% error rates on CIFAR-10 and CIFAR-100, respectively), and (2) the learned architecture achieves a 24.8% top-1 error rate on ImageNet.
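To make the two-phase pipeline described above concrete, the following is a minimal illustrative sketch in Python, not the authors' implementation: the supernet training and inherited-weight evaluation are replaced by hypothetical toy proxy objectives (op_fitness, conn_fitness), and the STA transformation is reduced to a single substitute-style move. The sketch only reproduces the control flow the abstract describes: operations are searched first, then connection relationships are searched without any further training.

```python
import random

random.seed(0)

OP_CHOICES = ["conv3x3", "conv5x5", "sep3x3", "maxpool", "identity"]  # hypothetical pool
N_POS = 6  # number of positions whose operation is searched

def transform(state, n_values):
    # Toy stand-in for a discrete STA transformation operator (substitute one
    # component); the actual discrete STA uses swap/shift/symmetry/substitute moves.
    new = list(state)
    new[random.randrange(len(new))] = random.randrange(n_values)
    return new

def sta_search(init, n_values, fitness, iters=200, sample_size=8):
    # STA-style loop: sample a batch of transformed states from the incumbent
    # and greedily keep the best one found so far.
    best, best_fit = list(init), fitness(init)
    for _ in range(iters):
        cands = [transform(best, n_values) for _ in range(sample_size)]
        fits = [fitness(c) for c in cands]
        k = max(range(sample_size), key=fits.__getitem__)
        if fits[k] > best_fit:
            best, best_fit = cands[k], fits[k]
    return best

# Phase 1: candidate operation search. In HENA each sampled architecture would
# be briefly trained on mini-group data with supernet-inherited weights; a toy
# proxy objective stands in here.
def op_fitness(ops):
    return -sum(abs(o - 2) for o in ops)  # toy: pretend op index 2 is best

init_ops = [random.randrange(len(OP_CHOICES)) for _ in range(N_POS)]
ops = sta_search(init_ops, len(OP_CHOICES), op_fitness)

# Phase 2: connection relationship search. Candidates reuse the weights trained
# in phase 1, so scoring needs only a forward evaluation, no training (again
# mocked by a toy proxy).
N_EDGES = N_POS * (N_POS - 1) // 2  # possible pairwise connections
def conn_fitness(conn):
    return -abs(sum(conn) - N_POS)  # toy: prefer about N_POS active edges

conn = sta_search([random.randrange(2) for _ in range(N_EDGES)], 2, conn_fitness)

print("operations:", [OP_CHOICES[o] for o in ops])
print("edge mask :", conn)
```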


Data availability

Some or all data, models, or code generated or used during the study are available from the corresponding author upon request.


Acknowledgements

The work presented in this paper was supported by the National Natural Science Foundation of China (Grant Nos. 62273357 and 61860206014), the National Key Research and Development Program of China (2022YFC2904502), and the Hunan Provincial Natural Science Foundation of China (Grant No. 2021JJ20082).

Author information

Corresponding author

Correspondence to Xiaojun Zhou.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Du, Y., Zhou, X., Huang, T. et al. A hierarchical evolution of neural architecture search method based on state transition algorithm. Int. J. Mach. Learn. & Cyber. 14, 2723–2738 (2023). https://doi.org/10.1007/s13042-023-01794-w
