Abstract
The aim of this work is to search for a Convolutional Neural Network (CNN) architecture that performs well across all factors relevant to mobile devices, including accuracy, memory footprint, and computing time. Although deep learning has evolved for use on devices with minimal resources, its deployment is hampered by the fact that these devices are not designed to run complex models such as CNN architectures. To address this limitation, a Neural Architecture Search (NAS) strategy is considered, which employs a Multi-Objective Evolutionary Algorithm (MOEA) to create an efficient and robust CNN architecture by targeting three objectives: fast processing times, reduced storage, and high accuracy. Furthermore, we propose a new Efficient CNN Population Initialization (ECNN-PI) method that combines random models with selected strong models to generate the first-generation population. To validate the proposed method, CNN models are trained on the CIFAR-10, CIFAR-100, ImageNet, STL-10, FOOD-101, THFOOD-50, FGVC Aircraft, DTD, and Oxford-IIIT Pets benchmark datasets. The MOEA-Net algorithm outperformed other models on CIFAR-10, whereas MOEA-Net with the ECNN-PI method outperformed other models on both CIFAR-10 and CIFAR-100. Furthermore, both MOEA-Net and MOEA-Net with ECNN-PI outperformed DARTS, P-DARTS, and Relative-NAS on small-scale multi-class and fine-grained datasets.
Data Availability
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
References
Baldominos A, Saez Y, Isasi P (2017) Evolutionary convolutional neural networks: an application to handwriting recognition. Neurocomputing 283:38–52
Bossard L, Guillaumin M, Van Gool L (2014) Food-101 – mining discriminative components with random forests. In: Computer vision – ECCV 2014: 13th European conference, Zurich, Switzerland, September 6–12, 2014, proceedings, part VI, vol 8694, pp 446–461
Chen X, Xie L, Wu J, Tian Q (2019) Progressive differentiable architecture search: bridging the depth gap between search and evaluation. In: 2019 IEEE/CVF international conference on computer vision (ICCV), pp 1294–1303
Chollet F (2017) Xception: deep learning with Depthwise separable convolutions. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 1800–1807
Cimpoi M, Maji S, Kokkinos I, Mohamed S, Vedaldi A (2014) Describing textures in the wild. In: CVPR ’14 Proceedings of the 2014 IEEE conference on computer vision and pattern recognition, pp 3606–3613
Coates A, Ng AY, Lee H (2011) An analysis of single-layer networks in unsupervised feature learning. International Conference on Artificial Intelligence and Statistics 15:215–223
Cubuk ED, Zoph B, Shlens J, Le QV (2020) Randaugment: practical automated data augmentation with a reduced search space. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pp 702–703
Darlow LN, Crowley EJ, Antoniou A, Storkey AJ (2018) CINIC-10 is not ImageNet or CIFAR-10. arXiv:1810.03505
Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans Evol Comput 6(2):182–197
Dong H, Sun J, Sun X, Ding R (2020) A many-objective feature selection for multi-label classification. Knowl Based Syst 208:106456
Elsken T, Metzen JH, Hutter F (2018) Efficient multi-objective neural architecture search via lamarckian evolution. In: International conference on learning representations
He C, Ye H, Shen L, Zhang T (2020) MiLeNAS: efficient neural architecture search via mixed-level reformulation. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 11993–12002
He K, Zhang X, Ren S, Sun J (2016a) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 770–778
He K, Zhang X, Ren S, Sun J (2016b) Identity mappings in deep residual networks. In: European conference on computer vision, pp 630–645
Hornakova A, Henschel R, Rosenhahn B, Swoboda P (2020) Lifted disjoint paths with application in multiple object tracking. In: ICML 2020: 37th international conference on machine learning
Howard A, Pang R, Adam H, Le Q, Sandler M, Chen B, Wang W, Chen LC, Tan M, Chu G, Vasudevan V, Zhu Y (2019) Searching for MobileNetV3. In: 2019 IEEE/CVF International conference on computer vision (ICCV), pp 1314–1324
Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861
Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 7132–7141
Iandola FN, Han S, Moskewicz MW, Ashraf K, Dally WJ, Keutzer K (2016) SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5MB model size. arXiv:1602.07360
Krause J, Stark M, Deng J, Fei-Fei L (2013) 3D Object representations for fine-grained categorization. In: 2013 IEEE international conference on computer vision workshops, pp 554–561
Krizhevsky A (2009) Learning multiple layers of features from tiny images. Technical report, University of Toronto
Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
Liu C, Zoph B, Neumann M, Shlens J, Hua W, Li LJ, Fei-Fei L, Yuille AL, Huang J, Murphy K (2018a) Progressive neural architecture search. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 19–35
Liu H, Simonyan K, Vinyals O, Fernando C, Kavukcuoglu K (2018b) Hierarchical representations for efficient architecture search. In: International conference on learning representations
Liu H, Simonyan K, Yang Y (2018c) DARTS: differentiable architecture search. In: International conference on learning representations
Lu Z, Whalen I, Boddeti V, Dhebar Y, Deb K, Goodman E, Banzhaf W (2019) NSGA-Net: neural architecture search using multi-objective genetic algorithm. In: Proceedings of the genetic and evolutionary computation conference, pp 419–427
Maji S, Rahtu E, Kannala J, Blaschko MB, Vedaldi A (2013) Fine-grained visual classification of aircraft. arXiv:1306.5151
Moyano JM, Gibaja EL, Cios KJ, Ventura S (2020) Combining multi-label classifiers based on projections of the output space using evolutionary algorithms. Knowl Based Syst 196:105770
Nilsback ME, Zisserman A (2008) Automated flower classification over a large number of classes. In: 2008 Sixth Indian conference on computer vision, graphics & image processing, pp 722–729
Parkhi OM, Vedaldi A, Zisserman A, Jawahar CV (2012) Cats and dogs. In: 2012 IEEE conference on computer vision and pattern recognition, pp 3498–3505
Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Kopf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S (2019) PyTorch: An imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems, pp 8026–8037
Pham H, Guan M, Zoph B, Le Q, Dean J (2018) Efficient neural architecture search via parameters sharing. In: Proceedings of the 35th international conference on machine learning, PMLR 80:4095–4104
Real E, Moore S, Selle A, Saxena S, Suematsu YL, Tan J, Le QV, Kurakin A (2017) Large-scale evolution of image classifiers. In: ICML’17 Proceedings of the 34th international conference on machine learning, vol 70, pp 2902–2911
Real E, Aggarwal A, Huang Y, Le QV (2019) Regularized evolution for image classifier architecture search. Proceedings of the AAAI Conference on Artificial Intelligence 33(1):4780–4789
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg A C, Fei-Fei L (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252
Sandler M, Howard A, Zhu M, Zhmoginov A, Chen LC (2018) MobileNetV2: inverted residuals and linear bottlenecks. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 4510–4520
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR), pp 1–9
Tan H, Cheng R, Huang S, He C, Qiu C, Yang F, Luo P (2021) Relative NAS: relative neural architecture search via slow-fast learning. IEEE Trans Neural Netw Learn Syst, pp 1–15
Tan M, Le QV (2019) EfficientNet: rethinking model scaling for convolutional neural networks. In: International conference on machine learning, pp 6105–6114
Tan M, Chen B, Pang R, Vasudevan V, Sandler M, Howard A, Le QV (2019) MnasNet: Platform-aware neural architecture search for mobile. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 2820–2828
Termritthikun C, Kanprachar S (2017) Accuracy improvement of Thai food image recognition using deep convolutional neural networks. In: 2017 International electrical engineering congress (iEECON), pp 1–4
Termritthikun C, Jamtsho Y, Muneesawang P (2019a) On-device facial verification using NUF-Net model of deep learning. Eng Appl Artif Intell 85:579–589
Termritthikun C, Kanprachar S, Muneesawang P (2019b) NU-LiteNet: mobile landmark recognition using convolutional neural networks. ECTI Trans Comput Inf Technol (ECTI-CIT) 13:21–28
Termritthikun C, Jamtsho Y, Muneesawang P (2020) An improved residual network model for image recognition using a combination of snapshot ensembles and the cutout technique. Multimed Tools Appl 79(1):1475–1495
Termritthikun C, Jamtsho Y, Ieamsaard J, Muneesawang P, Lee I (2021) EEEA-Net: An early exit evolutionary neural architecture search. Eng Appl Artif Intel 104:104397
Umer A, Termritthikun C, Qiu T, Leong PHW, Lee I (2022) On-Device saliency prediction based on Pseudoknowledge distillation. IEEE Trans Industr Inform 18(9):6317–6325
Wu B, Keutzer K, Dai X, Zhang P, Wang Y, Sun F, Wu Y, Tian Y, Vajda P, Jia Y (2019) FBNet: hardware-aware efficient ConvNet design via differentiable neural architecture search. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 10734–10742
Wu Y, He K (2018) Group normalization. In: Proceedings of the European conference on computer vision (ECCV), pp 3–19
Xie L, Yuille A (2017) Genetic CNN. In: 2017 IEEE international conference on computer vision (ICCV)
Xu Y, Xie L, Zhang X, Chen X, Qi GJ, Tian Q, Xiong H (2020) PC-DARTS: partial channel connections for memory-efficient architecture search. In: ICLR 2020 : Eighth international conference on learning representations
Yan M, Zhao M, Xu Z, Zhang Q, Wang G, Su Z (2019) VarGFaceNet: an efficient variable group convolutional neural network for lightweight face recognition. In: 2019 IEEE/CVF international conference on computer vision workshop (ICCVW), pp 2647–2654
Yang Z, Wang Y, Chen X, Shi B, Xu C, Xu C, Tian Q, Xu C (2020) CARS: continuous evolution for efficient neural architecture search. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 1829–1838
Yu H, Peng H (2020) Cyclic differentiable architecture search. arXiv:2006.10724
Yun S, Han D, Chun S, Oh SJ, Yoo Y, Choe J (2019) CutMix: regularization strategy to train strong classifiers with localizable features. In: 2019 IEEE/CVF international conference on computer vision (ICCV), pp 6023–6032
Zhang F, Zhu X, Dai H, Ye M, Zhu C (2020) Distribution-aware coordinate representation for human pose estimation. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 7093–7102
Zhong Z, Lin ZQ, Bidart R, Hu X, Daya IB, Li Z, Zheng WS, Li J, Wong A (2020) Squeeze-and-attention networks for semantic segmentation. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 13065–13074
Zhu H, An Z, Yang C, Xu K, Zhao E, Xu Y (2019) EENA: efficient evolution of neural architecture. In: 2019 IEEE/CVF international conference on computer vision workshop (ICCVW)
Zoph B, Le QV (2016) Neural architecture search with reinforcement learning. In: ICLR
Acknowledgements
The authors would like to acknowledge the financial support from the Thailand Research Fund through the Royal Golden Jubilee PhD. Program (Grant No. PHD/0101/2559). The study utilizes Australia’s National Computational Infrastructure (NCI) under the National Computational Merit Allocation Scheme (NCMAS). We would like to extend our appreciation to Mr. Roy I. Morien of the Naresuan University Graduate School for his assistance in editing the English grammar and expression in the paper. Finally, we would like to thank Mr. Ayaz Umer and Ms. Suwichaya Suwanwimolkul for their unwavering support, without which the research would not have been successful.
Ethics declarations
Conflict of Interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
1.1 Implementation Details
In this section, we summarize the hyperparameters of all experiments, including the search space, search strategy, and evaluation, as shown in Table 7. For the search space, the hyperparameters are those defined by NSGA-Net, with only the initial channels reduced to 32 for CIFAR-10 and CIFAR-100. We also carried over the channel increments and squeeze-and-excitation modules used in NSGA-Net to our ImageNet evaluation; the remaining hyperparameters are set as in DARTS.
1.2 Architecture visualization
In this section, we visualize the architectures obtained by the MOEA-Net and MOEA-Net with ECNN-PI searches, as shown in Fig. 10. These cells correspond to the most reliable architectures in the entire population, i.e., those that best balance the three objectives.
1.3 Ablation analysis
This appendix describes the ablation analysis, which investigates different evolutionary settings by varying the EA’s population and generation parameters. We divided the experiment into three scenarios: first, 20 generations with 30 individuals; second, 30 generations with 40 individuals; and finally, 40 generations with 50 individuals. In each scenario, we searched for the best solution model using MOEA and MOEA with ECNN-PI on the CIFAR-100 dataset. All hyperparameters in this experiment were the same as in Table 7.
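As a toy illustration of how the ablation varies only the (generations, population size) pair — the function and fitness below are hypothetical stand-ins, not the paper’s multi-objective search, operators, or objectives — the three scenarios can be sketched as:

```python
import random

def run_search(generations, pop_size, genome_length=8, seed=0):
    """Toy stand-in for one evolutionary run: evolve bit-string genomes with
    a dummy single-objective fitness and return the best genome found. Only
    the (generations, pop_size) knobs mirror the ablation settings."""
    rng = random.Random(seed)
    fitness = lambda g: sum(g)  # dummy fitness, NOT the paper's objectives
    pop = [[rng.randint(0, 1) for _ in range(genome_length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        parent = max(pop, key=fitness)                        # select best
        child = [bit ^ (rng.random() < 0.1) for bit in parent]  # mutate
        pop[pop.index(min(pop, key=fitness))] = child         # replace worst
    return max(pop, key=fitness)

# The three ablation scenarios: (generations, population size).
results = {}
for gens, size in [(20, 30), (30, 40), (40, 50)]:
    results[(gens, size)] = run_search(gens, size)
```

Larger (generations, population) budgets explore more architectures per run at a higher search cost, which is the trade-off the ablation measures.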
According to the search results from 20 generations with 30 individuals and from 40 generations with 50 individuals in Table 8, the ECNN-PI-based MOEA outperforms MOEA in terms of accuracy and search cost, whereas MOEA yields lower complexity and fewer parameters. In the experimental results from 30 generations with 40 individuals, the ECNN-PI-based MOEA outperforms MOEA only in terms of accuracy; MOEA is better on the other indicators, namely FLOPS, number of parameters, and search cost.
Based on the results, we confirm that using ECNN-PI to seed the first-generation population with previously discovered gene models allows the evolutionary algorithm to converge faster than a truly random first-generation population. Moreover, when ECNN-PI is applied to MOEA, different evolutionary settings can influence FLOPS, the number of parameters, and the search cost. Finding suitable population and generation values for the EA is therefore crucial when weighing the different objectives.
The genes of the ECNN models used to initialize the first-generation population were searched and evaluated on the standard CIFAR-10, CIFAR-100, and ImageNet datasets. However, when applied to non-standard or specialized datasets, ECNN-PI may have limitations such as reduced accuracy.
We recommend conducting several experiments to find the ECNN model and then using the results of each experiment to generate a first-generation population to optimize the performance of the models.
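As a minimal sketch of the ECNN-PI idea described above — seeding the first generation with a mix of previously discovered strong genomes and random ones — assuming a simple bit-string gene encoding (the function names here are hypothetical illustrations, not the paper’s implementation):

```python
import random

def random_genome(length):
    # Random bit-string genome (a simplification of the actual encoding).
    return [random.randint(0, 1) for _ in range(length)]

def init_population_ecnn_pi(pop_size, seed_genomes, genome_length):
    """Form the first generation from previously discovered strong genomes
    plus random genomes, as in the ECNN-PI initialization (sketched)."""
    seeds = [list(g) for g in seed_genomes[:pop_size]]  # copy, cap at pop_size
    randoms = [random_genome(genome_length)
               for _ in range(pop_size - len(seeds))]   # fill remainder randomly
    return seeds + randoms

# Example: seed 2 of 6 population slots with known strong genomes.
strong = [[1, 0, 1, 1], [0, 1, 1, 0]]
population = init_population_ecnn_pi(6, strong, genome_length=4)
```

The seeded individuals give the EA good starting points, while the random remainder preserves diversity in the first generation.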
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Termritthikun, C., Jamtsho, Y., Muneesawang, P. et al. Evolutionary neural architecture search based on efficient CNN models population for image classification. Multimed Tools Appl 82, 23917–23943 (2023). https://doi.org/10.1007/s11042-022-14187-y