
Deep neural network hyper-parameter tuning through twofold genetic approach

  • Data analytics and machine learning

Abstract

This paper surveys traditional and meta-heuristic approaches for optimizing deep neural networks (DNNs) and proposes a genetic algorithm (GA)-based approach with two optimization phases: hyper-parameter discovery and optimal data-subset determination. The first phase quickly selects an optimal combination of network hyper-parameters to design a DNN; compared to a traditional grid-search-based method, the optimal parameters are computed 6.5 times faster for a recurrent neural network (RNN) and 8 times faster for a convolutional neural network (CNN). The proposed approach can tune multiple hyper-parameters simultaneously. The second phase finds an appropriate subset of the training data for near-optimal prediction performance, providing an additional speedup of 75.86% for the RNN and 41.12% for the CNN over the first phase.
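
The two-phase procedure summarized above lends itself to a compact genetic-algorithm loop. The following is a minimal, illustrative Python sketch of the first phase (hyper-parameter discovery); the search space, genetic operators, and placeholder fitness function are assumptions made for illustration and do not reproduce the authors' released code (see github.com/MIntelligence-Group/DNNDualOptiGA for the actual implementation). In practice, the fitness of a chromosome would be the validation accuracy of a DNN trained with the candidate hyper-parameters, and the same loop could in principle be reused in the second phase with chromosomes encoding training-data subsets instead of hyper-parameters.

```python
# Minimal sketch of GA-based hyper-parameter discovery (phase 1).
# The search space, operators, and fitness stub are illustrative assumptions,
# not the exact configuration used in the paper.
import random

SEARCH_SPACE = {
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3, 1e-2],
    "batch_size": [32, 64, 128, 256],
    "hidden_units": [64, 128, 256, 512],
    "optimizer": ["sgd", "adam", "rmsprop"],
    "activation": ["relu", "tanh"],
}

def random_individual():
    """Sample one hyper-parameter combination (a 'chromosome')."""
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def fitness(ind):
    """Placeholder fitness: replace with training a DNN built from `ind`
    and returning its validation accuracy."""
    return random.random()  # toy surrogate so the sketch runs end to end

def crossover(a, b):
    """Uniform crossover: each gene is copied from either parent."""
    return {k: random.choice([a[k], b[k]]) for k in SEARCH_SPACE}

def mutate(ind, rate=0.2):
    """Resample each gene with probability `rate`."""
    return {k: (random.choice(SEARCH_SPACE[k]) if random.random() < rate else v)
            for k, v in ind.items()}

def evolve(pop_size=10, generations=5, elite=2):
    """Evolve a population of hyper-parameter sets and return the fittest."""
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[:max(elite, 2)]
        children = [mutate(crossover(*random.sample(parents, 2)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

if __name__ == "__main__":
    print(evolve())
```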


Availability of data and material

The data and material are available at github.com/MIntelligence-Group/DNNDualOptiGA.


Acknowledgements

The authors would like to thank the editors and anonymous reviewers whose valuable comments have helped to improve the presentation of the paper.

Funding

This research was supported by the Ministry of Human Resource Development (MHRD), India, under grant number 1-3146198040.

Author information

Authors and Affiliations

Authors

Contributions

Puneet Kumar is the principal and corresponding author; he performed this research work during his Ph.D. at the Indian Institute of Technology Roorkee, India. All the authors contributed to the study conception, design, and technical writing. Implementation, data collection, and results analysis were performed by Puneet Kumar. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Puneet Kumar.

Ethics declarations

Conflicts of interest

The authors have no conflict of interest.

Code availability

The code is available at github.com/MIntelligence-Group/DNNDualOptiGA.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Human participants

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Kumar, P., Batra, S. & Raman, B. Deep neural network hyper-parameter tuning through twofold genetic approach. Soft Comput 25, 8747–8771 (2021). https://doi.org/10.1007/s00500-021-05770-w

