Abstract
Learning algorithms used in deep learning involve many attributes called hyperparameters; these variables help determine the network structure. The performance of an algorithm depends on these hyperparameter values, which must be set before the algorithm is actually run. This study presents an overview of some commonly used hyperparameters in the context of learning algorithms for training neural networks, along with an analysis of adaptive learning algorithms used to tune learning rates.
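For illustration only (this example is not taken from the paper), the following is a minimal NumPy sketch of one step of the Adam update rule, a representative adaptive learning-rate method of the kind such a study typically covers; the function name, default hyperparameter values, and the toy loss are assumptions made here.

```python
import numpy as np

def adam_update(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step: the effective learning rate is adapted per parameter
    using exponential moving averages of the gradient (m) and its square (v)."""
    m = beta1 * m + (1 - beta1) * grads          # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grads ** 2     # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)                 # bias correction for step t (t starts at 1)
    v_hat = v / (1 - beta2 ** t)
    params = params - lr * m_hat / (np.sqrt(v_hat) + eps)
    return params, m, v

# Usage on a toy quadratic loss L(w) = ||w||^2 / 2, whose gradient is simply w
w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
w, m, v = adam_update(w, grads=w, m=m, v=v, t=1)
```

Methods such as Adagrad, Adadelta, and RMSprop follow the same pattern of scaling the step size by accumulated gradient statistics, which is what makes the per-parameter learning rate "adaptive."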