Enhancing CNN structure and learning through NSGA-II-based multi-objective optimization

Elghazi, Khalid; Ramchoun, Hassan; Masrour, Tawfik

doi:10.1007/s12530-024-09574-9

Original Paper
Published: 01 April 2024

Volume 15, pages 1503–1519, (2024)
Evolving Systems

Khalid Elghazi¹,
Hassan Ramchoun² &
Tawfik Masrour¹,³

Abstract

In recent years, the advancement of convolutional neural networks (CNNs) has been driven by the pursuit of higher classification accuracy in image tasks. However, achieving optimal performance often requires extensive manual design that draws on domain-specific knowledge and an understanding of the problem. This approach often yields highly complex network architectures while overlooking the drawbacks of such complexity. To address this, we propose MOGA-CNN, a Multi-Objective Genetic Algorithm for CNN structure that treats CNN architecture design as a bi-objective optimization problem. MOGA-CNN aims to simultaneously maximize classification accuracy and minimize computational complexity, as measured by the number of learnable parameters. We employ the NSGA-II algorithm to explore the trade-offs between these two conflicting objectives. The main contribution of this paper is an encoding mechanism that captures the essential hyperparameters that influence CNN architecture, including the fully connected layer. To evaluate the proposed algorithm, we conducted extensive experiments on four datasets and compared its performance against state-of-the-art methods. The results consistently show that our approach performs satisfactorily relative to these methods.
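The bi-objective formulation in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names (`conv_params`, `dense_params`, `Candidate`, `dominates`, `pareto_front`), the small example network, and the error values are all assumptions chosen for illustration, and the numbers are placeholders rather than results from the paper. The sketch counts learnable parameters for a toy CNN and then extracts the first non-dominated front over (error, parameter count), which is the Pareto-dominance test that NSGA-II builds on. The full algorithm additionally ranks the remaining fronts, computes crowding distances, and applies selection, crossover, and mutation.

```python
from dataclasses import dataclass
from typing import List


def conv_params(in_channels: int, out_channels: int, kernel: int) -> int:
    """Weights plus biases of a 2D convolution with a square kernel."""
    return (kernel * kernel * in_channels + 1) * out_channels


def dense_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return (in_features + 1) * out_features


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for one evaluated architecture."""
    name: str
    error: float    # 1 - accuracy, minimized (placeholder value)
    n_params: int   # learnable parameters, minimized


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a.error <= b.error and a.n_params <= b.n_params
    better = a.error < b.error or a.n_params < b.n_params
    return no_worse and better


def pareto_front(population: List[Candidate]) -> List[Candidate]:
    """Candidates that no other member of the population dominates."""
    return [c for c in population
            if not any(dominates(other, c) for other in population)]


# Toy network on 28x28x1 inputs (assumed layout): two 5x5 convolutions
# without padding, each followed by 2x2 pooling, then one dense layer.
# Spatial size: 28 -> 24 -> 12 -> 8 -> 4, so the dense input is 16 * 4 * 4.
small_net_params = (
    conv_params(1, 6, 5)
    + conv_params(6, 16, 5)
    + dense_params(16 * 4 * 4, 10)
)

# Placeholder population; error values are illustrative, not measured.
population = [
    Candidate("small", 0.010, small_net_params),
    Candidate("wide", 0.008, 60_000),
    Candidate("bloated", 0.012, 80_000),
    Candidate("medium", 0.010, 9_000),
]

front = pareto_front(population)
print([c.name for c in front])
```

Treating error (rather than accuracy) as the first objective keeps both objectives in minimization form, which is the usual convention for dominance checks. In this placeholder population the "bloated" and "medium" candidates are dominated by "small", so only the accuracy/size trade-off between "small" and "wide" survives.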

Data availability

The datasets used in this study are publicly available from the following sources:
MNIST: the official website of the MNIST database (http://yann.lecun.com/exdb/mnist/).
Fashion MNIST: the Zalando Research GitHub repository (https://github.com/zalandoresearch/fashion-mnist).
CIFAR-10: the official CIFAR website (https://www.cs.toronto.edu/~kriz/cifar.html).
SVHN (Street View House Numbers): the Stanford University Street View House Numbers dataset page (http://ufldl.stanford.edu/housenumbers/).

Author information

Authors and Affiliations

Laboratory of Mathematical Modeling, Simulation and Smart Systems (L2M3S), National School of Arts and Crafts, Moulay Ismail University, Meknes, Morocco
Khalid Elghazi & Tawfik Masrour
Laboratory of Mathematical Modeling, Simulation and Smart Systems (L2M3S), National School of Business and Management, Moulay Ismail University, Meknes, Morocco
Hassan Ramchoun
Mathematics, Computer Science and Engineering Department, University of Quebec at Rimouski, Rimouski, Canada
Tawfik Masrour

Corresponding author

Correspondence to Hassan Ramchoun.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (IPYNB, 40 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Elghazi, K., Ramchoun, H. & Masrour, T. Enhancing CNN structure and learning through NSGA-II-based multi-objective optimization. Evolving Systems 15, 1503–1519 (2024). https://doi.org/10.1007/s12530-024-09574-9

Received: 12 September 2023
Accepted: 14 February 2024
Published: 01 April 2024
Issue Date: August 2024
DOI: https://doi.org/10.1007/s12530-024-09574-9
