Abstract
In this paper, we propose a new method for linear feature extraction and dimensionality reduction in classification problems. The method maximizes the Mutual Information (MI) between the extracted features and the class labels. A Gaussian Mixture model is used to describe the distribution of the data; from this model, the entropy of the data is estimated, and hence the MI at the output. A gradient-based algorithm is provided for the optimization. Experiments are reported in which the method is compared with other popular linear feature extractors.
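To make the idea concrete, the following is a minimal sketch of the approach described above, with several simplifications not taken from the paper: each class-conditional density is modelled by a single Gaussian rather than a full mixture, a one-dimensional projection is learned, and the gradient is approximated numerically instead of derived analytically. The function names (`gaussian_entropy`, `mutual_information`, `extract_feature`) are illustrative, not from the original work.

```python
import numpy as np

def gaussian_entropy(y):
    # Differential entropy of a 1-D Gaussian fitted to the samples y
    # (simplification: the paper models the data with a Gaussian Mixture).
    var = np.var(y) + 1e-12
    return 0.5 * np.log(2 * np.pi * np.e * var)

def mutual_information(w, X, labels):
    # I(y; C) ~= H(y) - sum_c P(c) H(y | C = c), with y = X w.
    y = X @ w
    classes, counts = np.unique(labels, return_counts=True)
    priors = counts / len(labels)
    h_cond = sum(p * gaussian_entropy(y[labels == c])
                 for c, p in zip(classes, priors))
    return gaussian_entropy(y) - h_cond

def extract_feature(X, labels, steps=200, lr=0.5, eps=1e-4, seed=0):
    # Gradient ascent on the MI estimate; the gradient is approximated
    # by central differences rather than the paper's analytic expression.
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(steps):
        grad = np.zeros_like(w)
        for i in range(len(w)):
            d = np.zeros_like(w)
            d[i] = eps
            grad[i] = (mutual_information(w + d, X, labels)
                       - mutual_information(w - d, X, labels)) / (2 * eps)
        w += lr * grad
        w /= np.linalg.norm(w)  # the MI estimate is scale-invariant in w
    return w
```

On two well-separated Gaussian classes, the learned projection should align with the discriminative direction, since that direction maximizes the gap between the marginal and class-conditional entropies.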
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Leiva-Murillo, J.M., Artés-Rodríguez, A. (2004). A Gaussian Mixture Based Maximization of Mutual Information for Supervised Feature Extraction. In: Puntonet, C.G., Prieto, A. (eds) Independent Component Analysis and Blind Signal Separation. ICA 2004. Lecture Notes in Computer Science, vol 3195. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30110-3_35
Print ISBN: 978-3-540-23056-4
Online ISBN: 978-3-540-30110-3