
Solving the Ill-Conditioning in Neural Network Learning

  • Chapter

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7700)

Abstract

In this paper we investigate the feed-forward learning problem. The well-known ill-conditioning present in most feed-forward learning problems is shown to result from the structure of the network. We also address the related problem that weights between ‘higher’ layers in the network must settle before the ‘lower’ weights can converge. We present a solution to these problems by modifying the structure of the network through the addition of linear connections that carry shared weights. We call the new structure the linearly augmented feed-forward network and show that the universal approximation theorems remain valid for it. Simulation experiments demonstrate the validity of the new method and show that the new network is less sensitive to local minima and learns faster than the original network.

Previously published in: Orr, G.B. and Müller, K.-R. (Eds.): LNCS 1524, ISBN 978-3-540-65311-0 (1998).
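The abstract describes the structural change only in words: linear connections carrying shared weights are added alongside the sigmoidal units. As a purely illustrative sketch (not the chapter's own formulation), the NumPy fragment below augments each hidden unit's sigmoidal response with a linear term whose weight lam is shared across the layer; the class name LinearlyAugmentedLayer, the parameter lam, and its default value are assumptions made for this example.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    class LinearlyAugmentedLayer:
        # Hidden layer whose units emit sigmoid(W x + b) + lam * (W x + b),
        # with the scalar lam shared by every unit in the layer. This mirrors
        # the idea of added linear connections with shared weights, but the
        # exact parameterisation is an assumption, not the chapter's definition.
        def __init__(self, n_in, n_out, lam=0.1, seed=0):
            rng = np.random.default_rng(seed)
            self.W = rng.normal(scale=1.0 / np.sqrt(n_in), size=(n_out, n_in))
            self.b = np.zeros(n_out)
            self.lam = lam  # shared weight on the linear path

        def forward(self, x):
            a = self.W @ x + self.b            # pre-activation
            return sigmoid(a) + self.lam * a   # nonlinear + shared linear response

    # Tiny forward pass through two augmented layers.
    layer1 = LinearlyAugmentedLayer(n_in=3, n_out=5)
    layer2 = LinearlyAugmentedLayer(n_in=5, n_out=1)
    y = layer2.forward(layer1.forward(np.array([0.2, -0.7, 1.0])))

One intuition for such an augmentation (our reading, not a statement from the chapter) is that the linear path keeps the layer responsive even where the sigmoid saturates, which is consistent with the abstract's claim of reduced ill-conditioning and faster learning.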





Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

van der Smagt, P., Hirzinger, G. (2012). Solving the Ill-Conditioning in Neural Network Learning. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds) Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, vol 7700. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35289-8_13


  • DOI: https://doi.org/10.1007/978-3-642-35289-8_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35288-1

  • Online ISBN: 978-3-642-35289-8

