
A Novel Adaptive Learning Rate Algorithm for Convolutional Neural Network Training

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 744))

Abstract

In this work, an adaptive learning rate algorithm for training Convolutional Neural Networks is presented. The adaptive learning rate is computed by harvesting the already computed first-order gradient information from three consecutive iterations of the training phase. The learning rate increases in proportion to the similarity of the gradient directions, in an attempt to accelerate convergence and locate a good solution. The proposed algorithm is well suited to the time-consuming training of Convolutional Neural Networks, as it alleviates the exhaustive heuristic search for a suitable learning rate, which is critical for the performance of the trained network. The experimental results indicate that the proposed algorithm produces networks with good classification accuracy, regardless of the initial learning rate value. Moreover, the training procedure performs comparably to, or better than, gradient descent with a fixed, heuristically chosen learning rate.
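The abstract describes the mechanism only at a high level. The following is a minimal NumPy sketch of the idea: the learning rate is scaled up when the directions of the three most recent gradients agree, and scaled down when they disagree. The use of cosine similarity as the direction-similarity measure, the averaging of the two pairwise similarities, and the constants `eta0` and `beta` are illustrative assumptions, not the authors' exact update rule.

    import numpy as np

    def adaptive_lr_step(params, grad, state, eta0=0.01, beta=0.5):
        """One gradient descent step whose learning rate grows with the
        directional similarity of the three most recent gradients.

        Sketch only: the scaling rule and the constants eta0/beta are
        assumptions, not the formulation from the paper.
        """
        g_prev2, g_prev1 = state.get("g_prev2"), state.get("g_prev1")
        eta = eta0
        if g_prev1 is not None and g_prev2 is not None:
            # Cosine similarity between consecutive gradient directions
            # (a hypothetical choice of "similarity of direction").
            s1 = np.dot(grad, g_prev1) / (
                np.linalg.norm(grad) * np.linalg.norm(g_prev1) + 1e-12)
            s2 = np.dot(g_prev1, g_prev2) / (
                np.linalg.norm(g_prev1) * np.linalg.norm(g_prev2) + 1e-12)
            # Enlarge the step in proportion to the agreement of the
            # gradients; similarities near -1 shrink it instead.
            eta = eta0 * (1.0 + beta * (s1 + s2) / 2.0)
        # Shift the gradient history window by one iteration.
        state["g_prev2"], state["g_prev1"] = g_prev1, grad.copy()
        return params - eta * grad, state

    # Hypothetical usage on a toy quadratic loss f(w) = ||w||^2 / 2,
    # whose gradient at w is simply w itself.
    w, state = np.ones(5), {}
    for _ in range(100):
        w, state = adaptive_lr_step(w, w, state)

Since the first two iterations have no complete gradient history, the sketch falls back to the base rate `eta0` until three gradients are available, which matches the abstract's description of using three consecutive iterations.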


Notes

  1. The code is available at https://github.com/Georgakopoulos-Sp/.



Acknowledgments

We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X Pascal GPU used for this research. This work was partially supported by a Hellenic Artificial Intelligence Society (EETN) scholarship.

Author information


Corresponding author

Correspondence to S. V. Georgakopoulos.



Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Georgakopoulos, S.V., Plagianakos, V.P. (2017). A Novel Adaptive Learning Rate Algorithm for Convolutional Neural Network Training. In: Boracchi, G., Iliadis, L., Jayne, C., Likas, A. (eds) Engineering Applications of Neural Networks. EANN 2017. Communications in Computer and Information Science, vol 744. Springer, Cham. https://doi.org/10.1007/978-3-319-65172-9_28


  • DOI: https://doi.org/10.1007/978-3-319-65172-9_28


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-65171-2

  • Online ISBN: 978-3-319-65172-9

  • eBook Packages: Computer Science (R0)
