
Efficient Minimisation of the KL Distance for the Approximation of Posterior Conditional Probabilities

Abstract

The minimisation of a least mean squares cost function produces poor results in the ranges of the input variable where the quantity to be approximated takes on relatively low values. This is a problem when an accurate approximation is required over a wide dynamic range. The present paper addresses this problem in the case of multilayer perceptrons trained to approximate the posterior conditional probabilities in a multicategory classification problem. A cost function derived from the Kullback–Leibler information distance measure is proposed, and a computationally light algorithm is derived for its minimisation. The effectiveness of the procedure is verified experimentally.
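
To make the idea concrete, the following is a minimal NumPy sketch, not the paper's algorithm: it trains a small softmax-output multilayer perceptron on a hypothetical three-class problem by minimising the KL distance between 1-of-K targets and the network outputs, which for such targets coincides with the cross-entropy cost. The network sizes, learning rate and toy data are illustrative assumptions.

# Illustrative sketch (not the authors' exact procedure): a single-hidden-layer
# MLP with softmax outputs trained to approximate posterior class probabilities.
# With 1-of-K targets, minimising the KL distance between the target distribution
# and the network output reduces to the cross-entropy cost, whose output-layer
# error term is simply (y - t).
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)      # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(X, W1, b1, W2, b2):
    h = np.tanh(X @ W1 + b1)                  # hidden layer
    y = softmax(h @ W2 + b2)                  # estimated posterior probabilities
    return h, y

def kl_cost(T, Y, eps=1e-12):
    # KL(T || Y) averaged over patterns; for 1-of-K targets this is cross-entropy.
    return -np.sum(T * np.log(Y + eps)) / len(T)

def train_step(X, T, W1, b1, W2, b2, lr=0.1):
    h, y = forward(X, W1, b1, W2, b2)
    d2 = (y - T) / len(T)                     # output error term for the KL cost
    d1 = (d2 @ W2.T) * (1.0 - h ** 2)         # backprop through tanh
    W2 -= lr * h.T @ d2; b2 -= lr * d2.sum(axis=0)
    W1 -= lr * X.T @ d1; b1 -= lr * d1.sum(axis=0)
    return kl_cost(T, y)

# Toy three-class problem: Gaussian clusters with 1-of-K targets (illustrative only).
X = np.vstack([rng.normal(m, 0.5, size=(100, 2)) for m in ([-2, 0], [0, 2], [2, 0])])
T = np.repeat(np.eye(3), 100, axis=0)
W1, b1 = rng.normal(0, 0.5, (2, 16)), np.zeros(16)
W2, b2 = rng.normal(0, 0.5, (16, 3)), np.zeros(3)
for epoch in range(200):
    cost = train_step(X, T, W1, b1, W2, b2)
print(f"final KL/cross-entropy cost: {cost:.4f}")

The point of the sketch is that the output-layer error term reduces to the difference between outputs and targets, which is what keeps the minimisation computationally light compared with reweighting a squared-error cost.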

Cite this article

Battisti, M., Burrascano, P. & Pirollo, D. Efficient Minimisation of the KL Distance for the Approximation of Posterior Conditional Probabilities. Neural Processing Letters 5, 47–55 (1997). https://doi.org/10.1023/A:1009605310499
