Abstract
Designing suitable convolutional neural networks (CNNs) for different image data requires considerable human effort and expertise. In recent years, this process has been greatly accelerated by automatic architecture design methods. However, existing work rarely integrates the macro-architecture search space with the depth search space, which usually leads to suboptimal architectures. In addition, the adopted search strategy often has to be customized to be compatible with the architecture encoding. This paper therefore proposes an automatic architecture design method based on monarch butterfly optimization (MBO). Specifically, an expressive Neural Function Unit (NFU)-based architecture representation is designed, which integrates promising architectures from GoogLeNet, ResNet and DenseNet to facilitate the joint search of the macro-architecture and depth of CNNs. Furthermore, a direct architecture encoding is designed to take advantage of the fast-converging MBO, whose evolutionary operators involve no complex computations and continuously improve the architecture population by optimizing the encoding. Extensive experiments on eight benchmark image datasets demonstrate that the method achieves consistently competitive performance with much less time and computational overhead.
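As a rough illustration of how MBO's two operators (migration and butterfly adjusting) could act on a direct integer encoding, the following minimal Python sketch may help. It is not the paper's implementation: the gene meanings, the `toy_fitness` stand-in for validation accuracy, the population settings, and the replacement of continuous MBO's Lévy-flight step with a random gene reset are all assumptions made for illustration. The values 5/12 and 1.2 are the defaults commonly used with MBO.

```python
import math
import random

# Hypothetical direct encoding: one integer gene per network position.
# 0 = no unit, 1 = Inception-style, 2 = residual-style, 3 = dense-style.
# These gene meanings are illustrative assumptions, not the paper's encoding.
UNIT_TYPES = 4
MAX_UNITS = 8
P = 5 / 12     # migration ratio (common MBO default)
PERI = 1.2     # migration period (common MBO default)
BAR = 5 / 12   # butterfly adjusting rate (common MBO default)


def toy_fitness(genome):
    """Stand-in for validation accuracy; a real run would train the decoded CNN."""
    active = [g for g in genome if g != 0]
    return len(set(active)) - 0.5 * abs(len(active) - 5)


def migrate(land1, land2, rng):
    """Migration: each gene is copied from a random member of Land 1 or Land 2."""
    children = []
    for _ in land1:
        child = []
        for k in range(MAX_UNITS):
            source = land1 if rng.random() * PERI <= P else land2
            child.append(rng.choice(source)[k])
        children.append(child)
    return children


def adjust(land2, best, rng):
    """Adjusting: genes come from the best genome or a random Land 2 member.
    The Levy-flight perturbation is replaced by a random gene reset (assumption)."""
    children = []
    for _ in land2:
        child = []
        for k in range(MAX_UNITS):
            if rng.random() <= P:
                gene = best[k]
            else:
                gene = rng.choice(land2)[k]
                if rng.random() > BAR:
                    gene = rng.randrange(UNIT_TYPES)
            child.append(gene)
        children.append(child)
    return children


def evolve(pop_size=12, generations=20, seed=0):
    """Run the simplified MBO loop and return the best genome found."""
    rng = random.Random(seed)
    pop = [[rng.randrange(UNIT_TYPES) for _ in range(MAX_UNITS)]
           for _ in range(pop_size)]
    n1 = math.ceil(P * pop_size)  # size of the Land 1 subpopulation
    for _ in range(generations):
        pop.sort(key=toy_fitness, reverse=True)
        best = list(pop[0])
        land1, land2 = pop[:n1], pop[n1:]
        pop = migrate(land1, land2, rng) + adjust(land2, best, rng)
        # Elitism: overwrite the worst child with the previous best genome,
        # so the best fitness in the population never decreases.
        pop.sort(key=toy_fitness, reverse=True)
        pop[-1] = best
    return max(pop, key=toy_fitness)
```

Calling `evolve()` returns one integer list of length `MAX_UNITS`. Both operators only copy or reset genes, which is the sense in which such operators involve no complex computations; in a real search almost all of the cost would lie in evaluating each decoded network.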







Cite this article
Wang, Y., Qiao, X. & Wang, GG. Architecture evolution of convolutional neural network using monarch butterfly optimization. J Ambient Intell Human Comput 14, 12257–12271 (2023). https://doi.org/10.1007/s12652-022-03766-4
DOI: https://doi.org/10.1007/s12652-022-03766-4