SPSA for Layer-Wise Training of Deep Networks

Wulff, Benjamin; Schuecker, Jannis; Bauckhage, Christian

doi:10.1007/978-3-030-01424-7_55

SPSA for Layer-Wise Training of Deep Networks

Benjamin Wulff^18,19,
Jannis Schuecker^18,19 &
Christian Bauckhage^18,19,20

Conference paper
First Online: 27 September 2018

8695 Accesses
1 Citations

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 11141))

Abstract

Concerned with neural learning without backpropagation, we investigate variants of the simultaneous perturbation stochastic approximation (SPSA) algorithm. Experimental results suggest that these allow for the successful training of deep feed-forward neural networks using forward passes only. In particular, we find that SPSA-based algorithms which update network parameters in a layer-wise manner are superior to variants which update all weights simultaneously.

This is a preview of subscription content, log in via an institution.

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

References

Baldi, P., Sadowski, P., Lu, Z.: Learning in the machine: random backpropagation and the deep learning channel. arXiv:1612.02734 [cs.LG] (2016)
Bauckhage, C., Thurau, C.: Making archetypal analysis practical. In: Denzler, J., Notni, G., Süße, H. (eds.) DAGM 2009. LNCS, vol. 5748, pp. 272–281. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03798-6_28
Chapter Google Scholar
Bengio, Y., Lamblin, P., Popovic, D., Larochelle, H.: Greedy layer-wise training of deep networks. In: Proceedings NIPS (2006)
Google Scholar
Choy, M., Srinivasan, D., Cheu, R.: Neural networks for continuous online learning and control. IEEE Trans. Neural Netw. 17(6), 2006 (2006)
Google Scholar
Courbariaux, M., Bengio, Y., David, J.P.: Training deep neural networks with low precision multiplications. arXiv:1412.7024 [cs.LG] (2014)
Garipov, T., Izmailov, P., Podoprikhin, D., Vetrov, D., Wilson, A.: Loss surfaces, mode connectivity, and fast ensembling of DNNs. arXiv:1802.10026 [stat.ML] (2018)
Hinton, G., Osindero, S., Teh, Y.: A fast learning algorithm for deep belief nets. In: Proceedings NIPS (2006)
Article MathSciNet Google Scholar
Hochreiter, S., Schmidhuber, J.: Flat minima. Neural Comput. 9(1), 1–42 (1997)
Article Google Scholar
Hooke, R., Jeeves, T.: Direct search solution of numerical and statistical problems. J. ACM 8(2), 212–229 (1961)
Article Google Scholar
Izmailov, P., Garipov, D.P.T., Vetrov, D., Wilson, A.: Averaging weights leads to wider optima and better generalization. arXiv:1803.05407 [cs.LG] (2018)
Jaderberg, M., et al.: Decoupled neural interfaces using synthetic gradients. arXiv:1608.05343 [cs.LG] (2016)
Kiefer, J., Wolfowitz, J.: Estimation of the maximum of a regression function. Ann. Math. Stat. 23(3), 462–466 (1952)
Article MathSciNet Google Scholar
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998)
Article Google Scholar
Lillicrap, T., Cownden, D., Tweed, D., Akerman, J.: Random synaptic feedback weights support error backpropagation for deep learning. Nat. Commun. 7(13276) (2016)
Article Google Scholar
Loshchilov, I., Hutter, F.: SGDR: stochastic gradient descent with warm restarts. In: Proceedings ICLR (2017)
Google Scholar
Nelder, J., Mead, R.: A simplex method for function minimization. Comput. J. 7(4), 308–313 (1965)
Article MathSciNet Google Scholar
Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22(3), 400–407 (1951)
Article MathSciNet Google Scholar
Rosenfeld, A., Tsotsos, J.: Intriguing properties of randomly weighted networks: generalizing while learning next to nothing. arXiv:1802.00844 [cs.LG] (2018)
Rummelhart, D., Hinton, G., Williams, R.: Learning representations by back-propagating errors. Nature 323(6088), 533–536 (1986)
Article Google Scholar
Sehnke, F., Osendorfer, C., Rückstieß, T., Graves, A., Peters, J., Schmidhuber, J.: Policy gradients with parameter-based exploration for control. In: Kůrková, V., Neruda, R., Koutník, J. (eds.) ICANN 2008. LNCS, vol. 5163, pp. 387–396. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-87536-9_40
Chapter Google Scholar
Smith, L.: Cyclical learning rates for training neural networks. In: Proceedings Winter Conference on Applications of Computer Vision. IEEE (2017)
Google Scholar
Song, Q., Spall, J., Soh, Y.C., Nie, J.: Robust neural network tracking controller using simultaneous perturbation stochastic approximation. IEEE Trans. Neural Netw. 19(5), 817–835 (2008)
Article Google Scholar
Spall, J.: Multivariate stochastic approximation using a simultaneous perturbation gradient approximation. IEEE Trans. Autom. Control 37(3), 332–341 (1992)
Article MathSciNet Google Scholar
Spall, J.: Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control. Wiley, Hoboken (2003)
Book Google Scholar
Taylor, G., Burmeister, R., Xu, Z., Singh, B., Patel, A., Goldstein, T.: Training neural networks without gradients: a scalable ADMM approach. In: Proceedings ICML (2016)
Google Scholar
Thurau, C., Kersting, K., Wahabzada, M., Bauckhage, C.: Convex non-negative matrix factorization for massive datasets. Knowl. Inf. Syst. 29(2), 457–478 (2011)
Article Google Scholar
Vande Wouver, A., Renotte, C., Remy, M.: On the use of simultaneuous perturbation stochastic approximation for neural network training. In: Proceedings American Control Conference. IEEE (1999)
Google Scholar
Williams, R.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8(3–4), 229–256 (1992)
MATH Google Scholar

Download references

Author information

Authors and Affiliations

Fraunhofer Center for Machine Learning, Sankt Augustin, Germany
Benjamin Wulff, Jannis Schuecker & Christian Bauckhage
Fraunhofer IAIS, Sankt Augustin, Germany
Benjamin Wulff, Jannis Schuecker & Christian Bauckhage
B-IT, University of Bonn, Bonn, Germany
Christian Bauckhage

Authors

Benjamin Wulff
View author publications
You can also search for this author in PubMed Google Scholar
Jannis Schuecker
View author publications
You can also search for this author in PubMed Google Scholar
Christian Bauckhage
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Benjamin Wulff .

Editor information

Editors and Affiliations

Czech Academy of Sciences, Prague 8, Czech Republic
Věra Kůrková
Open University of Cyprus, Latsia, Cyprus
Yannis Manolopoulos
CITEC Bielefeld University, Bielefeld, Germany
Barbara Hammer
Democritus University of Thrace, Xanthi, Greece
Lazaros Iliadis
University of Piraeus, Piraeus, Greece
Ilias Maglogiannis

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Wulff, B., Schuecker, J., Bauckhage, C. (2018). SPSA for Layer-Wise Training of Deep Networks. In: Kůrková, V., Manolopoulos, Y., Hammer, B., Iliadis, L., Maglogiannis, I. (eds) Artificial Neural Networks and Machine Learning – ICANN 2018. ICANN 2018. Lecture Notes in Computer Science(), vol 11141. Springer, Cham. https://doi.org/10.1007/978-3-030-01424-7_55

Download citation

DOI: https://doi.org/10.1007/978-3-030-01424-7_55
Published: 27 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01423-0
Online ISBN: 978-3-030-01424-7
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics