Research Article · Public Access

Co-evolving Recurrent Neural Networks and their Hyperparameters with Simplex Hyperparameter Optimization

Published: 24 July 2023

Abstract

Designing machine learning models involves determining not only the network architecture, but also non-architectural elements such as training hyperparameters. Compounding this problem, different architectures and datasets perform best with different hyperparameters. The problem is exacerbated for neuroevolution (NE) and neural architecture search (NAS) algorithms, which generate and train networks with a wide variety of architectures in their search for optimal ones. If hyperparameters are fixed in such algorithms, the search is biased toward architectures that happen to suit those fixed values, and suboptimal architectures may be found. This paper evaluates the simplex hyperparameter optimization (SHO) method, which co-evolves hyperparameters over the course of a NE run, allowing the NE algorithm to optimize network architectures and hyperparameters simultaneously. SHO has previously been shown to optimize hyperparameters for convolutional neural networks trained with traditional stochastic gradient descent with Nesterov momentum; this work extends that evaluation to evolving recurrent neural networks with additional modern weight optimizers such as RMSProp and Adam. Results show that incorporating SHO into the neuroevolution process not only finds better performing architectures but also converges to optimal architectures faster across all datasets and optimization methods tested.
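SHO builds on the Nelder-Mead simplex method [33], applied to training hyperparameters rather than a closed-form objective. The abstract does not specify the algorithm's details, but the core simplex idea it relies on can be sketched as follows; the hyperparameter choices (learning rate, dropout), their ranges, and the toy fitness function are illustrative assumptions, not the paper's actual setup:

```python
import random

def toy_fitness(hp):
    # Stand-in for "train a network with these hyperparameters and
    # return validation loss" (lower is better); purely illustrative.
    lr, dropout = hp
    return (lr - 0.01) ** 2 + (dropout - 0.2) ** 2

def simplex_candidate(population, alpha=1.0):
    """Propose a new hyperparameter set by reflecting the worst point
    through the centroid of the remaining (better) points."""
    ranked = sorted(population, key=toy_fitness)
    best, worst = ranked[:-1], ranked[-1]
    centroid = [sum(p[i] for p in best) / len(best) for i in range(len(worst))]
    return [c + alpha * (c - w) for c, w in zip(centroid, worst)]

random.seed(0)
# Initial "simplex": three random (learning_rate, dropout) pairs.
population = [[random.uniform(0.001, 0.1), random.uniform(0.0, 0.5)]
              for _ in range(3)]
candidate = simplex_candidate(population)
print(candidate)  # the next hyperparameter pair to evaluate
```

In a co-evolutionary setting such as the one described here, each evaluation of `toy_fitness` would instead train a candidate network, so the simplex moves hyperparameters toward regions that work well for the architectures currently being evolved.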


Cited By

  • GOLEM: Flexible Evolutionary Design of Graph Representations of Physical and Digital Objects. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (2024), 1668--1675. DOI: 10.1145/3638530.3664141. Published 14 July 2024.
  • Efficient Neuroevolution Using Island Repopulation and Simplex Hyperparameter Optimization. In 2023 IEEE Symposium Series on Computational Intelligence (SSCI), 1837--1842. DOI: 10.1109/SSCI52147.2023.10371872. Published 5 December 2023.

Published In

GECCO '23 Companion: Proceedings of the Companion Conference on Genetic and Evolutionary Computation
July 2023
2519 pages
ISBN:9798400701207
DOI:10.1145/3583133

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. hyperparameter tuning
  2. time series forecasting
  3. neural architecture search
  4. recurrent neural networks
  5. neuroevolution


Conference

GECCO '23 Companion (overall acceptance rate: 1,669 of 4,410 submissions, 38%)
