
Iterative weighted least squares algorithms for neural networks classifiers

  • Conference paper
Algorithmic Learning Theory (ALT 1992)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 743))


Abstract

This paper discusses learning algorithms for layered neural networks from the standpoint of maximum likelihood estimation. The Fisher information is calculated explicitly for a network with a single neuron; it can be interpreted as a weighted covariance matrix of the input vectors. A learning algorithm based on Fisher's scoring method is presented, and it is shown that this algorithm can be interpreted as an iterated weighted least squares method. These results are then extended to a layered network with one hidden layer, for which the Fisher information is again given by a weighted covariance matrix of the inputs and the outputs of the hidden units. Two new algorithms are proposed by utilizing this information. Experiments show that the algorithms converge in fewer iterations than the usual BP (back-propagation) algorithm. In particular, the UFS (unitwise Fisher's scoring) method reduces to an algorithm in which each unit estimates its own weights by a weighted least squares method.
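For the single-neuron case, the Fisher scoring iteration the abstract describes coincides with iteratively reweighted least squares (IRLS) for a sigmoid unit, i.e. logistic regression: the Fisher information X^T W X is a weighted covariance of the inputs, with weights p(1-p). A minimal sketch in Python/NumPy, assuming this standard correspondence (the function name, the numerical guard on the weights, and the small ridge term are our own additions, not from the paper):

```python
import numpy as np

def irls_logistic(X, y, n_iter=20, ridge=1e-8):
    """Fisher scoring / IRLS for a single sigmoid unit (logistic regression).

    Each iteration solves a weighted least squares problem whose normal
    matrix X^T W X (the Fisher information) is a weighted covariance of
    the input vectors, with weights w_i = p_i * (1 - p_i).
    """
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))       # predicted probabilities
        w = np.maximum(p * (1.0 - p), 1e-12)       # Fisher weights, guarded
        # working response: z = X beta + W^{-1} (y - p)
        z = X @ beta + (y - p) / w
        # weighted least squares step: (X^T W X) beta = X^T W z
        XtW = X.T * w
        beta = np.linalg.solve(XtW @ X + ridge * np.eye(d), XtW @ z)
    return beta
```

The small ridge term keeps the solve stable when the data are nearly separable and the Fisher weights shrink toward zero.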



Editor information

Shuji Doshita, Koichi Furukawa, Klaus P. Jantke, Toyoaki Nishida


Copyright information

© 1993 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kurita, T. (1993). Iterative weighted least squares algorithms for neural networks classifiers. In: Doshita, S., Furukawa, K., Jantke, K.P., Nishida, T. (eds) Algorithmic Learning Theory. ALT 1992. Lecture Notes in Computer Science, vol 743. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57369-0_29

  • DOI: https://doi.org/10.1007/3-540-57369-0_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-57369-2

  • Online ISBN: 978-3-540-48093-8

  • eBook Packages: Springer Book Archive
