Evolutionary neural architecture search based on efficient CNN models population for image classification

Abstract

The aim of this work is to search for a Convolutional Neural Network (CNN) architecture that performs well across all factors, including accuracy, memory footprint, and computing time, and is therefore suitable for mobile devices. Although deep learning has evolved for use on devices with minimal resources, its deployment is hampered by the fact that these devices are not designed to tackle computationally demanding tasks, such as running CNN architectures. To address this limitation, a Neural Architecture Search (NAS) strategy is considered, which employs a Multi-Objective Evolutionary Algorithm (MOEA) to create an efficient and robust CNN architecture by focusing on three objectives: fast processing time, reduced storage, and high accuracy. Furthermore, we propose a new Efficient CNN Population Initialization (ECNN-PI) method that utilizes a combination of random and selected strong models to generate the first-generation population. To validate the proposed method, CNN models are trained on the CIFAR-10, CIFAR-100, ImageNet, STL-10, FOOD-101, THFOOD-50, FGVC Aircraft, DTD, and Oxford-IIIT Pets benchmark datasets. The MOEA-Net algorithm outperformed other models on CIFAR-10, whereas MOEA-Net with the ECNN-PI method outperformed other models on both CIFAR-10 and CIFAR-100. Furthermore, both MOEA-Net and MOEA-Net with ECNN-PI outperformed DARTS, P-DARTS, and Relative-NAS on small-scale multi-class and fine-grained datasets.

Data Availability

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

References

  1. Baldominos A, Saez Y, Isasi P (2017) Evolutionary convolutional neural networks: an application to handwriting recognition. Neurocomputing 283:38–52

  2. Bossard L, Guillaumin M, Gool LJV (2014) Food-101 – mining discriminative components with random forests. In: Computer vision – ECCV 2014: 13th European conference, Zurich, Switzerland, September 6–12, 2014, proceedings, part VI, vol 8694, pp 446–461

  3. Chen X, Xie L, Wu J, Tian Q (2019) Progressive differentiable architecture search: bridging the depth gap between search and evaluation. In: 2019 IEEE/CVF international conference on computer vision (ICCV), pp 1294–1303

  4. Chollet F (2017) Xception: deep learning with depthwise separable convolutions. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 1800–1807

  5. Cimpoi M, Maji S, Kokkinos I, Mohamed S, Vedaldi A (2014) Describing textures in the wild. In: CVPR ’14 Proceedings of the 2014 IEEE conference on computer vision and pattern recognition, pp 3606–3613

  6. Coates A, Ng AY, Lee H (2011) An analysis of single-layer networks in unsupervised feature learning. International Conference on Artificial Intelligence and Statistics 15:215–223

  7. Cubuk ED, Zoph B, Shlens J, Le QV (2020) Randaugment: practical automated data augmentation with a reduced search space. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pp 702–703

  8. Darlow LN, Crowley EJ, Antoniou A, Storkey AJ (2018) CINIC-10 is not ImageNet or CIFAR-10. arXiv:1810.03505

  9. Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans Evol Comput 6(2):182–197

  10. Dong H, Sun J, Sun X, Ding R (2020) A many-objective feature selection for multi-label classification. Knowl Based Syst 208:106456

  11. Elsken T, Metzen JH, Hutter F (2018) Efficient multi-objective neural architecture search via Lamarckian evolution. In: International conference on learning representations

  12. He C, Ye H, Shen L, Zhang T (2020) MiLeNAS: efficient neural architecture search via mixed-level reformulation. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 11993–12002

  13. He K, Zhang X, Ren S, Sun J (2016a) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 770–778

  14. He K, Zhang X, Ren S, Sun J (2016b) Identity mappings in deep residual networks. In: European conference on computer vision, pp 630–645

  15. Hornakova A, Henschel R, Rosenhahn B, Swoboda P (2020) Lifted disjoint paths with application in multiple object tracking. In: ICML 2020: 37th international conference on machine learning

  16. Howard A, Pang R, Adam H, Le Q, Sandler M, Chen B, Wang W, Chen LC, Tan M, Chu G, Vasudevan V, Zhu Y (2019) Searching for MobileNetV3. In: 2019 IEEE/CVF International conference on computer vision (ICCV), pp 1314–1324

  17. Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861

  18. Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 7132–7141

  19. Iandola FN, Han S, Moskewicz MW, Ashraf K, Dally WJ, Keutzer K (2016) SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. arXiv:1602.07360

  20. Krause J, Stark M, Deng J, Fei-Fei L (2013) 3D Object representations for fine-grained categorization. In: 2013 IEEE international conference on computer vision workshops, pp 554–561

  21. Krizhevsky A (2009) Learning multiple layers of features from tiny images

  22. Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90

  23. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444

  24. Liu C, Zoph B, Neumann M, Shlens J, Hua W, Li LJ, Fei-Fei L, Yuille AL, Huang J, Murphy K (2018a) Progressive neural architecture search. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 19–35

  25. Liu H, Simonyan K, Vinyals O, Fernando C, Kavukcuoglu K (2018b) Hierarchical representations for efficient architecture search. In: International conference on learning representations

  26. Liu H, Simonyan K, Yang Y (2018c) DARTS: differentiable architecture search. In: International conference on learning representations

  27. Lu Z, Whalen I, Boddeti V, Dhebar Y, Deb K, Goodman E, Banzhaf W (2019) NSGA-Net: neural architecture search using multi-objective genetic algorithm. In: Proceedings of the genetic and evolutionary computation conference, pp 419–427

  28. Maji S, Rahtu E, Kannala J, Blaschko MB, Vedaldi A (2013) Fine-grained visual classification of aircraft. arXiv:1306.5151

  29. Moyano JM, Gibaja EL, Cios KJ, Ventura S (2020) Combining multi-label classifiers based on projections of the output space using evolutionary algorithms. Knowl Based Syst 196:105770

  30. Nilsback ME, Zisserman A (2008) Automated flower classification over a large number of classes. In: 2008 Sixth Indian conference on computer vision, graphics & image processing, pp 722–729

  31. Parkhi OM, Vedaldi A, Zisserman A, Jawahar CV (2012) Cats and dogs. In: 2012 IEEE conference on computer vision and pattern recognition, pp 3498–3505

  32. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Kopf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S (2019) PyTorch: An imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems, pp 8026–8037

  33. Pham H, Guan M, Zoph B, Le Q, Dean J (2018) Efficient neural architecture search via parameters sharing. In: Proceedings of the 35th international conference on machine learning, PMLR 80, pp 4095–4104

  34. Real E, Moore S, Selle A, Saxena S, Suematsu YL, Tan J, Le QV, Kurakin A (2017) Large-scale evolution of image classifiers. In: ICML'17 Proceedings of the 34th international conference on machine learning, vol 70, pp 2902–2911

  35. Real E, Aggarwal A, Huang Y, Le QV (2019) Regularized evolution for image classifier architecture search. Proceedings of the AAAI Conference on Artificial Intelligence 33(1):4780–4789

  36. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252

  37. Sandler M, Howard A, Zhu M, Zhmoginov A, Chen LC (2018) MobileNetV2: inverted residuals and linear bottlenecks. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 4510–4520

  38. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR), pp 1–9

  39. Tan H, Cheng R, Huang S, He C, Qiu C, Yang F, Luo P (2021) Relative NAS: relative neural architecture search via slow-fast learning. IEEE Trans Neural Netw Learn Syst, pp 1–15

  40. Tan M, Le QV (2019) EfficientNet: rethinking model scaling for convolutional neural networks. In: International conference on machine learning, pp 6105–6114

  41. Tan M, Chen B, Pang R, Vasudevan V, Sandler M, Howard A, Le QV (2019) MnasNet: Platform-aware neural architecture search for mobile. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 2820–2828

  42. Termritthikun C, Kanprachar S (2017) Accuracy improvement of Thai food image recognition using deep convolutional neural networks. In: 2017 International electrical engineering congress (iEECON), pp 1–4

  43. Termritthikun C, Jamtsho Y, Muneesawang P (2019a) On-device facial verification using NUF-Net model of deep learning. Eng Appl Artif Intell 85:579–589

  44. Termritthikun C, Kanprachar S, Muneesawang P (2019b) NU-LiteNet: mobile landmark recognition using convolutional neural networks. ECTI Transactions on Computer and Information Technology (ECTI-CIT) 13:21–28

  45. Termritthikun C, Jamtsho Y, Muneesawang P (2020) An improved residual network model for image recognition using a combination of snapshot ensembles and the cutout technique. Multimed Tools Appl 79(1):1475–1495

  46. Termritthikun C, Jamtsho Y, Ieamsaard J, Muneesawang P, Lee I (2021) EEEA-Net: An early exit evolutionary neural architecture search. Eng Appl Artif Intell 104:104397

  47. Umer A, Termritthikun C, Qiu T, Leong PHW, Lee I (2022) On-Device saliency prediction based on Pseudoknowledge distillation. IEEE Trans Industr Inform 18(9):6317–6325

  48. Wu B, Keutzer K, Dai X, Zhang P, Wang Y, Sun F, Wu Y, Tian Y, Vajda P, Jia Y (2019) FBNet: hardware-aware efficient ConvNet design via differentiable neural architecture search. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 10734–10742

  49. Wu Y, He K (2018) Group normalization. In: Proceedings of the European conference on computer vision (ECCV), pp 3–19

  50. Xie L, Yuille A (2017) Genetic CNN. In: 2017 IEEE international conference on computer vision (ICCV)

  51. Xu Y, Xie L, Zhang X, Chen X, Qi GJ, Tian Q, Xiong H (2020) PC-DARTS: partial channel connections for memory-efficient architecture search. In: ICLR 2020: Eighth international conference on learning representations

  52. Yan M, Zhao M, Xu Z, Zhang Q, Wang G, Su Z (2019) VarGFaceNet: an efficient variable group convolutional neural network for lightweight face recognition. In: 2019 IEEE/CVF international conference on computer vision workshop (ICCVW), pp 2647–2654

  53. Yang Z, Wang Y, Chen X, Shi B, Xu C, Xu C, Tian Q, Xu C (2020) CARS: continuous evolution for efficient neural architecture search. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 1829–1838

  54. Yu H, Peng H (2020) Cyclic differentiable architecture search. arXiv:2006.10724

  55. Yun S, Han D, Chun S, Oh SJ, Yoo Y, Choe J (2019) CutMix: regularization strategy to train strong classifiers with localizable features. In: 2019 IEEE/CVF international conference on computer vision (ICCV), pp 6023–6032

  56. Zhang F, Zhu X, Dai H, Ye M, Zhu C (2020) Distribution-aware coordinate representation for human pose estimation. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 7093–7102

  57. Zhong Z, Lin ZQ, Bidart R, Hu X, Daya IB, Li Z, Zheng WS, Li J, Wong A (2020) Squeeze-and-attention networks for semantic segmentation. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 13065–13074

  58. Zhu H, An Z, Yang C, Xu K, Zhao E, Xu Y (2019) EENA: efficient evolution of neural architecture. In: 2019 IEEE/CVF international conference on computer vision workshop (ICCVW)

  59. Zoph B, Le QV (2016) Neural architecture search with reinforcement learning. In: International conference on learning representations

Acknowledgements

The authors would like to acknowledge the financial support from the Thailand Research Fund through the Royal Golden Jubilee Ph.D. Program (Grant No. PHD/0101/2559). This study utilized Australia's National Computational Infrastructure (NCI) under the National Computational Merit Allocation Scheme (NCMAS). We would like to extend our appreciation to Mr. Roy I. Morien of the Naresuan University Graduate School for his assistance in editing the English grammar and expression in the paper. Finally, we would like to thank Mr. Ayaz Umer and Ms. Suwichaya Suwanwimolkul for their unwavering support, without which this research would not have been successful.

Author information

Corresponding author

Correspondence to Chakkrit Termritthikun.

Ethics declarations

Conflict of Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Implementation details

In this section, we summarize the hyperparameters of all experiments, covering the search space, search strategy, and evaluation, as shown in Table 7. For the search space, we adopt the hyperparameters defined by NSGA-Net, with only the number of initial channels reduced to 32 for CIFAR-10 and CIFAR-100. We also extend the channel increments and squeeze-and-excitation modules used in NSGA-Net to our ImageNet evaluation; the remaining hyperparameters are set as in DARTS.
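To make this configuration concrete, the following is a minimal sketch of how such a setup might be expressed in code. Only the 32 initial channels for CIFAR-10 and CIFAR-100 comes from the text; every other field name and value is an illustrative assumption, not the paper's actual implementation.

```python
def make_search_config(dataset: str) -> dict:
    """Build a hypothetical hyperparameter dictionary for one experiment."""
    is_cifar = dataset in ("cifar10", "cifar100")
    return {
        # Search-space settings follow NSGA-Net, except the initial channels,
        # which are reduced to 32 for CIFAR-10 and CIFAR-100.
        "init_channels": 32 if is_cifar else 48,   # 48 is an illustrative value
        # Channel increments and squeeze-and-excitation from NSGA-Net are
        # extended to the ImageNet evaluation.
        "channel_increment": dataset == "imagenet",
        "squeeze_excitation": dataset == "imagenet",
        # Remaining hyperparameters are set as in DARTS (values illustrative).
        "layers": 20,
        "batch_size": 96,
        "learning_rate": 0.025,
        "weight_decay": 3e-4,
    }

print(make_search_config("cifar10")["init_channels"])  # -> 32
```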

1.2 Architecture visualization

In this section, we visualize the architectures obtained by the MOEA-Net and MOEA-Net with ECNN-PI searches, as shown in Fig. 10. These cells represent the most reliable solutions, i.e., the architectures that best minimize the three objectives across the entire population.

Fig. 10: Normal and reduction cell architectures found by MOEA-Net and MOEA-Net with ECNN-PI

1.3 Ablation analysis

This appendix describes the ablation analysis, which investigates different environmental evolution scenarios by varying the EA's population and generation parameters. We divided the experiment into three settings, as sketched below: first, 20 generations with 30 individuals; second, 30 generations with 40 individuals; and finally, 40 generations with 50 individuals. For each setting, we searched for the best solution model using both MOEA and MOEA with ECNN-PI on the CIFAR-100 dataset. All other hyperparameters in this experiment were the same as in Table 7.
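The experimental grid can be summarized by the short driver below. Only the three (generations, individuals) pairs and the CIFAR-100 dataset come from the text; `run_search`, its parameters, and the result handling are hypothetical stand-ins for the actual MOEA search routine.

```python
SCENARIOS = [(20, 30), (30, 40), (40, 50)]  # (generations, individuals)

def run_ablation(run_search):
    """Run every scenario with and without ECNN-PI initialization."""
    results = {}
    for generations, individuals in SCENARIOS:
        for init in ("random", "ecnn_pi"):
            # Each call returns the best solution model for that setting.
            results[(generations, individuals, init)] = run_search(
                dataset="cifar100",
                generations=generations,
                population_size=individuals,
                initialization=init,
            )
    return results
```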

According to the search results for 20 generations with 30 individuals and 40 generations with 50 individuals in Table 8, the ECNN-PI-based MOEA outperforms plain MOEA in terms of accuracy and search cost, while plain MOEA yields lower complexity (FLOPS) and fewer parameters than MOEA with ECNN-PI.

For 30 generations with 40 individuals, plain MOEA's indicators, namely FLOPS, number of parameters, and search cost, are better than those of MOEA with ECNN-PI; in this setting, the ECNN-PI-based MOEA outperforms plain MOEA only in terms of accuracy.

Based on these results, it is confirmed that using ECNN-PI to seed the first generation with previously discovered gene models allows the evolutionary algorithm to converge faster than a purely random first-generation population. Moreover, when ECNN-PI is applied to MOEA, we found that different environmental evolution settings can influence FLOPS, the number of parameters, and the search cost. Furthermore, choosing suitable population and generation values for the EA is crucial when weighing the results across the various objectives.
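As a concrete illustration of this seeding scheme, the sketch below builds a first generation from a mix of previously discovered strong gene models and freshly generated random genomes. The function names, the 50/50 seed split, and the genome representation are all assumptions for illustration, not the authors' implementation.

```python
import random

def init_population(pop_size, strong_genes, random_genome, seed_fraction=0.5):
    """Build a first generation from strong seeds plus random genomes."""
    # Take up to seed_fraction of the population from known strong models.
    n_seeds = min(int(pop_size * seed_fraction), len(strong_genes))
    population = random.sample(strong_genes, n_seeds)   # selected strong models
    # Fill the rest of the generation with random individuals.
    while len(population) < pop_size:
        population.append(random_genome())
    random.shuffle(population)  # avoid positional bias in selection
    return population

# Example: 30 individuals, seeded from 8 previously discovered genomes.
seeds = [f"gene_model_{i}" for i in range(8)]
pop = init_population(30, seeds, random_genome=lambda: "random_genome")
```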

Table 7 A summary of hyperparameters

The genes of the ECNN models used to initialize the first-generation population were searched and evaluated on the standard CIFAR-10, CIFAR-100, and ImageNet datasets. When applied to non-standard or specialized datasets, ECNN-PI may therefore have limitations such as reduced accuracy.

We recommend conducting several search experiments to find ECNN models and then using the results of each experiment to generate the first-generation population, thereby optimizing model performance.

Table 8 Ablation analysis: comparison of different environmental evolution scenarios by varying the EA’s population and generation parameters

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Termritthikun, C., Jamtsho, Y., Muneesawang, P. et al. Evolutionary neural architecture search based on efficient CNN models population for image classification. Multimed Tools Appl 82, 23917–23943 (2023). https://doi.org/10.1007/s11042-022-14187-y
