
Solving the Ill-Conditioning in Neural Network Learning

  • Chapter

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7700)

Abstract

In this paper we investigate the feed-forward learning problem. The well-known ill-conditioning present in most feed-forward learning problems is shown to result from the structure of the network. We also address the related problem that weights between ‘higher’ layers in the network must settle before the ‘lower’ weights can converge. We present a solution to these problems by modifying the structure of the network through the addition of linear connections that carry shared weights. We call the new structure the linearly augmented feed-forward network and show that the universal approximation theorems remain valid for it. Simulation experiments demonstrate the validity of the new method and show that the new network is less sensitive to local minima and learns faster than the original network.

Previously published in: Orr, G.B. and Müller, K.-R. (Eds.): LNCS 1524, ISBN 978-3-540-65311-0 (1998).
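The abstract describes the structural change only in words: linear connections carrying shared weights are added alongside the sigmoidal units. As a purely illustrative sketch (not the chapter's own formulation), the NumPy fragment below augments each hidden unit's sigmoidal response with a linear term whose weight lam is shared across the layer; the class name LinearlyAugmentedLayer, the parameter lam, and its default value are assumptions made for this example.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    class LinearlyAugmentedLayer:
        # Hidden layer whose units emit sigmoid(W x + b) + lam * (W x + b),
        # with the scalar lam shared by every unit in the layer. This mirrors
        # the idea of added linear connections with shared weights, but the
        # exact parameterisation is an assumption, not the chapter's definition.
        def __init__(self, n_in, n_out, lam=0.1, seed=0):
            rng = np.random.default_rng(seed)
            self.W = rng.normal(scale=1.0 / np.sqrt(n_in), size=(n_out, n_in))
            self.b = np.zeros(n_out)
            self.lam = lam  # shared weight on the linear path

        def forward(self, x):
            a = self.W @ x + self.b            # pre-activation
            return sigmoid(a) + self.lam * a   # nonlinear + shared linear response

    # Tiny forward pass through two augmented layers.
    layer1 = LinearlyAugmentedLayer(n_in=3, n_out=5)
    layer2 = LinearlyAugmentedLayer(n_in=5, n_out=1)
    y = layer2.forward(layer1.forward(np.array([0.2, -0.7, 1.0])))

One intuition for such an augmentation (our reading, not a statement from the chapter) is that the linear path keeps the layer responsive even where the sigmoid saturates, which is consistent with the abstract's claim of reduced ill-conditioning and faster learning.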





Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

van der Smagt, P., Hirzinger, G. (2012). Solving the Ill-Conditioning in Neural Network Learning. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds) Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, vol 7700. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35289-8_13


  • DOI: https://doi.org/10.1007/978-3-642-35289-8_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35288-1

  • Online ISBN: 978-3-642-35289-8

