Abstract
This paper discusses learning algorithms for layered neural networks from the standpoint of maximum likelihood estimation. The Fisher information is calculated explicitly for a network with a single neuron, and can be interpreted as a weighted covariance matrix of the input vectors. A learning algorithm based on Fisher's scoring method is presented and shown to be interpretable as iterations of a weighted least squares method. These results are then extended to a layered network with one hidden layer, for which the Fisher information is likewise given as a weighted covariance matrix of the inputs and the outputs of the hidden units. Two new algorithms are proposed that exploit this information, and experiments show that they converge in fewer iterations than the usual BP algorithm. In particular, the UFS (unitwise Fisher's scoring) method reduces to an algorithm in which each unit estimates its own weights by a weighted least squares method.
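The single-neuron case described in the abstract, with a sigmoid output and a Bernoulli likelihood, is equivalent to logistic regression, where Fisher's scoring method is the classical iteratively reweighted least squares (IRLS) procedure. The sketch below illustrates that correspondence; it is a minimal NumPy illustration of generic IRLS, not the paper's own implementation, and the function name, the `ridge` stabilizer, and the iteration count are choices made here for the example.

```python
import numpy as np

def irls_single_neuron(X, y, n_iter=20, ridge=1e-8):
    """Fisher scoring (IRLS) for one sigmoid neuron, i.e. logistic regression.

    X: (n, d) input matrix (include a bias column if desired).
    y: (n,) binary targets in {0, 1}.

    At each step the Fisher information is the weighted covariance
    X^T W X with W = diag(p * (1 - p)), and the Fisher-scoring update
    is exactly the solution of a weighted least squares problem.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))       # neuron output
        W = p * (1.0 - p)                       # Fisher weights
        # Working response for the weighted least squares step:
        # z = Xw + W^{-1}(y - p), guarded against tiny weights.
        z = X @ w + (y - p) / np.maximum(W, ridge)
        # Solve (X^T W X) w = X^T W z; the left-hand matrix is the
        # Fisher information (a weighted covariance of the inputs).
        A = X.T @ (W[:, None] * X) + ridge * np.eye(d)
        w = np.linalg.solve(A, X.T @ (W * z))
    return w
```

Expanding `X.T @ (W * z)` shows the update is `w + (X^T W X)^{-1} X^T (y - p)`, i.e. a Newton/Fisher-scoring step on the log-likelihood, which is the "iterations of a weighted least squares method" interpretation the abstract refers to.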
Copyright information
© 1993 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kurita, T. (1993). Iterative weighted least squares algorithms for neural networks classifiers. In: Doshita, S., Furukawa, K., Jantke, K.P., Nishida, T. (eds) Algorithmic Learning Theory. ALT 1992. Lecture Notes in Computer Science, vol 743. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57369-0_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-57369-2
Online ISBN: 978-3-540-48093-8