
Deep neural network hyper-parameter tuning through twofold genetic approach

  • Data analytics and machine learning

Abstract

This paper surveys traditional and meta-heuristic approaches for optimizing deep neural networks (DNNs) and proposes a genetic algorithm (GA)-based approach with two optimization phases: hyper-parameter discovery and optimal data-subset determination. The first phase quickly selects an optimal combination of network hyper-parameters to design a DNN; compared to a traditional grid-search-based method, the optimal parameters are computed 6.5 times faster for a recurrent neural network (RNN) and 8 times faster for a convolutional neural network (CNN). The proposed approach can tune multiple hyper-parameters simultaneously. The second phase finds an appropriate subset of the training data for near-optimal prediction performance, providing an additional speedup of 75.86% for the RNN and 41.12% for the CNN over the first phase.
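
The two-phase procedure summarized above lends itself to a compact genetic-algorithm loop. The following is a minimal, illustrative Python sketch of the first phase (hyper-parameter discovery); the search space, genetic operators, and placeholder fitness function are assumptions made for illustration and do not reproduce the authors' released code (see github.com/MIntelligence-Group/DNNDualOptiGA for the actual implementation). In practice, the fitness of a chromosome would be the validation accuracy of a DNN trained with the candidate hyper-parameters, and the same loop could in principle be reused in the second phase with chromosomes encoding training-data subsets instead of hyper-parameters.

```python
# Minimal sketch of GA-based hyper-parameter discovery (phase 1).
# The search space, operators, and fitness stub are illustrative assumptions,
# not the exact configuration used in the paper.
import random

SEARCH_SPACE = {
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3, 1e-2],
    "batch_size": [32, 64, 128, 256],
    "hidden_units": [64, 128, 256, 512],
    "optimizer": ["sgd", "adam", "rmsprop"],
    "activation": ["relu", "tanh"],
}

def random_individual():
    """Sample one hyper-parameter combination (a 'chromosome')."""
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def fitness(ind):
    """Placeholder fitness: replace with training a DNN built from `ind`
    and returning its validation accuracy."""
    return random.random()  # toy surrogate so the sketch runs end to end

def crossover(a, b):
    """Uniform crossover: each gene is copied from either parent."""
    return {k: random.choice([a[k], b[k]]) for k in SEARCH_SPACE}

def mutate(ind, rate=0.2):
    """Resample each gene with probability `rate`."""
    return {k: (random.choice(SEARCH_SPACE[k]) if random.random() < rate else v)
            for k, v in ind.items()}

def evolve(pop_size=10, generations=5, elite=2):
    """Evolve a population of hyper-parameter sets and return the fittest."""
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[:max(elite, 2)]
        children = [mutate(crossover(*random.sample(parents, 2)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

if __name__ == "__main__":
    print(evolve())
```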


Availability of data and material

The data and material are available at github.com/MIntelligence-Group/DNNDualOptiGA.


Acknowledgements

The authors would like to thank the editors and anonymous reviewers whose valuable comments have helped to improve the presentation of the paper.

Funding

This research was supported by the Ministry of Human Resource Development (MHRD), India, under grant number 1-3146198040.

Author information

Authors and Affiliations

Authors

Contributions

Puneet Kumar is the principal and corresponding author; he performed this research work during his Ph.D. at the Indian Institute of Technology Roorkee, India. All the authors contributed to the study conception, design, and technical writing. Implementation, data collection, and results analysis were performed by Puneet Kumar. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Puneet Kumar.

Ethics declarations

Conflicts of interest

The authors have no conflict of interest.

Code availability

The code is available at github.com/MIntelligence-Group/DNNDualOptiGA.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Human participants

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Kumar, P., Batra, S. & Raman, B. Deep neural network hyper-parameter tuning through twofold genetic approach. Soft Comput 25, 8747–8771 (2021). https://doi.org/10.1007/s00500-021-05770-w

