Abstract
In this paper we investigate the feed-forward learning problem. The ill-conditioning present in most feed-forward learning problems is shown to result from the structure of the network. We also address the well-known problem that weights in ‘higher’ layers of the network must settle before ‘lower’ weights can converge. We solve both problems by modifying the structure of the network: linear connections carrying shared weights are added. We call the new structure the linearly augmented feed-forward network, and show that the universal approximation theorems remain valid. Simulation experiments confirm the validity of the new method, and demonstrate that the new network is less sensitive to local minima and learns faster than the original network.
Previously published in: Orr, G.B. and Müller, K.-R. (Eds.): LNCS 1524, ISBN 978-3-540-65311-0 (1998).
© 2012 Springer-Verlag Berlin Heidelberg
Cite this chapter
van der Smagt, P., Hirzinger, G. (2012). Solving the Ill-Conditioning in Neural Network Learning. In: Montavon, G., Orr, G.B., Müller, KR. (eds) Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, vol 7700. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35289-8_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35288-1
Online ISBN: 978-3-642-35289-8
eBook Packages: Computer Science, Computer Science (R0)