Abstract
The stochastic optimization problem in deep learning involves finding optimal values of the loss function and the neural network parameters using a meta-heuristic search algorithm. Because these values cannot reasonably be obtained with a deterministic optimization technique, an iterative method is needed that randomly samples data segments, arbitrarily initializes the optimization (network) parameters, and repeatedly computes the error function until a tolerable error is attained. The standard stochastic optimization algorithm for training deep neural networks, a non-convex optimization problem, is gradient descent, with extensions such as Stochastic Gradient Descent (SGD), Adagrad, Adadelta, RMSProp, and Adam. Each of these stochastic optimizers improves on its predecessors in terms of accuracy, convergence rate, and training time, yet there remains room for further improvement. This paper presents the outcomes of a series of experiments conducted to provide empirical evidence of the progress made so far. We used Python deep learning libraries (TensorFlow and the Keras API) for our experiments. Each algorithm was executed, the results were collated, and a case is made for further research in deep learning to improve the training time, convergence rate, and accuracy of deep neural networks. This is in response to the growing demand for deep learning in mission-critical and highly sophisticated decision-making processes across industry verticals.
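For concreteness, the sketch below illustrates how such an optimizer comparison can be set up with TensorFlow and the Keras API. It is a minimal illustration rather than the authors' exact experimental setup: the MNIST-style dataset, the small feedforward architecture, the epoch count, and the learning rates are assumptions chosen for brevity.

```python
# Minimal sketch (assumed setup, not the paper's exact configuration):
# train the same small network with each optimizer and compare accuracy.
import tensorflow as tf
from tensorflow import keras

# MNIST is used here only as an illustrative benchmark.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def build_model():
    # Illustrative feedforward network; depth and width are arbitrary choices.
    return keras.Sequential([
        keras.layers.Flatten(input_shape=(28, 28)),
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dense(10, activation="softmax"),
    ])

# The optimizers compared in the paper; hyperparameters are Keras defaults
# except for the SGD learning rate, which is an assumed value.
optimizers = {
    "SGD": keras.optimizers.SGD(learning_rate=0.01),
    "Adagrad": keras.optimizers.Adagrad(),
    "Adadelta": keras.optimizers.Adadelta(),
    "RMSProp": keras.optimizers.RMSprop(),
    "Adam": keras.optimizers.Adam(),
}

for name, opt in optimizers.items():
    model = build_model()
    model.compile(optimizer=opt,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(x_train, y_train, epochs=5,
                        validation_data=(x_test, y_test), verbose=0)
    print(f"{name}: final validation accuracy "
          f"{history.history['val_accuracy'][-1]:.4f}")
```

Running each optimizer against an identical architecture and data split, as above, is one straightforward way to attribute differences in accuracy, convergence rate, and training time to the optimizer itself rather than to the model.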