Abstract
The previous chapters provided extensive coverage of the error entropy criterion (EEC), especially with regard to minimization of the error entropy (MEE) for linear and nonlinear filtering (or regression) applications. However, the spectrum of engineering applications of adaptive systems is much broader than filtering or regression. Even within the subclass of supervised applications, we have yet to address classification, which is an important application area for learning technologies. All of the practical ingredients needed to extend EEC to classification are in place, inasmuch as Chapter 5 covered the integration of EEC with the backpropagation algorithm (MEE-BP); hence we have all the tools needed to train classifiers with MEE. We show that this is indeed the case and that classifiers trained with MEE typically perform better than MSE-trained classifiers. However, there is still no mathematical foundation establishing under what conditions EEC is optimal for classification, and further work is necessary.
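To make the criterion concrete, the following is a minimal sketch of the empirical MEE loss as it is usually estimated in information-theoretic learning: Renyi's quadratic error entropy H2(e) = -log V(e), where the information potential V(e) is a Parzen estimate with Gaussian kernels over pairwise error differences. The function names and the kernel size sigma are illustrative choices, not the chapter's code; minimizing this loss (equivalently, maximizing V) is what MEE-BP backpropagates through the network.

```python
import numpy as np

def information_potential(errors, sigma=1.0):
    """Parzen estimate of the quadratic information potential V(e).

    V(e) = (1/N^2) * sum_i sum_j G_{sigma*sqrt(2)}(e_i - e_j),
    where G is a Gaussian kernel (the kernel variance doubles because
    the estimate convolves two Parzen kernels).
    """
    e = np.asarray(errors, dtype=float).ravel()
    diff = e[:, None] - e[None, :]            # all pairwise error differences
    s2 = 2.0 * sigma ** 2                     # variance of the sqrt(2)*sigma kernel
    kernel = np.exp(-diff ** 2 / (2.0 * s2)) / np.sqrt(2.0 * np.pi * s2)
    return kernel.mean()                      # averages over N^2 pairs

def mee_loss(targets, outputs, sigma=1.0):
    """Empirical MEE loss: Renyi's quadratic entropy of the errors."""
    return -np.log(information_potential(targets - outputs, sigma))
```

Because V(e) is smooth in the network outputs, its gradient can be propagated through any differentiable classifier exactly as with MSE; the kernel size sigma acts as a smoothing hyperparameter that must be chosen (or annealed) for the error distribution at hand.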
Copyright information
© 2010 Springer Science+Business Media, LLC
Cite this chapter
Erdogmus, D., Xu, D., Hild, K. (2010). Classification with EEC, Divergence Measures, and Error Bounds. In: Information Theoretic Learning. Information Science and Statistics. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-1570-2_6
DOI: https://doi.org/10.1007/978-1-4419-1570-2_6
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-1569-6
Online ISBN: 978-1-4419-1570-2