Abstract
Neural networks stand out in Artificial Intelligence for their applicability to multiple challenging tasks such as image classification. However, designing a neural network for a particular problem is itself a demanding task that requires expertise and time-consuming trial-and-error stages. Methods that automate neural network design form a research field that generally relies on optimization algorithms, such as population-based meta-heuristics. This work studies the use of Teaching-Learning-Based Optimization (TLBO), which, to the authors' knowledge, had not been applied to this purpose before. TLBO is widely used and has no algorithm-specific parameters. Moreover, having been conceived as a large-scale optimizer, it is well suited to deep neural network design, i.e., architectures with many layers. A new encoding scheme is proposed to make this continuous optimizer compatible with neural network design. This method, which is of general application, i.e., not tied to TLBO, can represent different network architectures as a plain vector of real values. A compatible objective function linking the optimizer and the representation of solutions has also been developed. The performance of this framework has been studied by designing an image classification neural network for the CIFAR-10 dataset. After evolution, the resulting networks outperform the initial human-designed solutions.
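The abstract describes two ingredients: a continuous encoding that maps a plain vector of real values to a network architecture, and TLBO as the optimizer acting on such vectors. The sketch below illustrates both ideas in a minimal form. It is an assumption-laden illustration, not the paper's actual encoding or implementation: the gene layout (`active_gate`, `type_selector`, `size_selector`), the layer vocabulary, and the size range are all hypothetical choices made here for clarity; the teacher phase follows the standard TLBO formulation by Rao et al.

```python
import numpy as np

# Hypothetical encoding: each candidate layer occupies a fixed-size group of
# real values in [0, 1]. The first gene gates whether the layer exists (which
# allows variable-depth networks in a fixed-length vector); the others select
# the layer type and its size.
LAYER_TYPES = ["conv", "pool", "dense"]
GENES_PER_LAYER = 3  # [active_gate, type_selector, size_selector]

def decode(vector, max_layers=4):
    """Map a flat real-valued vector to a list of (layer_type, size) pairs."""
    arch = []
    for i in range(max_layers):
        gate, type_gene, size_gene = \
            vector[i * GENES_PER_LAYER:(i + 1) * GENES_PER_LAYER]
        if gate < 0.5:  # layer disabled
            continue
        idx = min(int(type_gene * len(LAYER_TYPES)), len(LAYER_TYPES) - 1)
        size = 16 + int(size_gene * 112)  # illustrative range: 16..128 units
        arch.append((LAYER_TYPES[idx], size))
    return arch

def teacher_phase(pop, fitness, rng):
    """Standard TLBO teacher phase (minimization): move every learner
    toward the best individual relative to the population mean."""
    teacher = pop[np.argmin(fitness)]
    mean = pop.mean(axis=0)
    tf = rng.integers(1, 3)  # teaching factor, randomly 1 or 2
    new_pop = pop + rng.random(pop.shape) * (teacher - tf * mean)
    return np.clip(new_pop, 0.0, 1.0)  # keep genes in the encoded domain
```

In a full run, each new vector would be decoded, the resulting network trained and evaluated on CIFAR-10, and its validation error used as the fitness that drives the next TLBO iteration.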
Acknowledgements
This research has been funded by the R+D+i project RTI2018-095993-B-I00, financed by MCIN/AEI/10.13039/501100011033/ and ERDF "A way to make Europe"; by the Junta de Andalucía with reference P18-RT-1193; by the University of Almería with reference UAL18-TIC-A020-B; and by the Department of Informatics of the University of Almería. M. Lupión is supported by the FPU program of the Spanish Ministry of Education (FPU19/02756). N.C. Cruz is supported by the Ministry of Economic Transformation, Industry, Knowledge and Universities of the Andalusian government.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Lupión, M., Cruz, N.C., Paechter, B., Ortigosa, P.M. (2023). On Optimizing the Structure of Neural Networks Through a Compact Codification of Their Architecture. In: Di Gaspero, L., Festa, P., Nakib, A., Pavone, M. (eds) Metaheuristics. MIC 2022. Lecture Notes in Computer Science, vol 13838. Springer, Cham. https://doi.org/10.1007/978-3-031-26504-4_10
DOI: https://doi.org/10.1007/978-3-031-26504-4_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26503-7
Online ISBN: 978-3-031-26504-4
eBook Packages: Computer Science (R0)