Abstract
This paper presents a method for determining an optimal set of components for a density mixture model using mutual information. A component with small mutual information is considered independent of the remaining components; it makes a significant contribution to the system and hence cannot be removed. Conversely, a component with large mutual information is unlikely to be independent of the remaining components and hence can be removed. Repeatedly removing components with positive mutual information until the system mutual information becomes non-positive finally yields a parsimonious structure for the density mixture model. The method has been verified on several examples.
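The abstract specifies the pruning loop but not the per-component mutual information estimator itself. The sketch below is a minimal illustration only, under assumptions not taken from the paper: a one-dimensional Gaussian mixture, and a Monte Carlo estimate of the KL divergence from each component to the renormalised mixture of the others, used as an inverse proxy for mutual dependence (strong overlap with the other components corresponds to low divergence and hence to a removable component). The names component_divergence and prune_mixture and the threshold value are hypothetical.

import numpy as np
from scipy.stats import norm

def component_divergence(weights, means, stds, k, n_samples=20000, rng=None):
    # Monte Carlo estimate of KL(p_k || p_rest): how distinct component k is
    # from the renormalised mixture of the remaining components. Low values
    # mean high overlap, i.e. strong dependence on the other components.
    rng = np.random.default_rng(0) if rng is None else rng
    x = rng.normal(means[k], stds[k], size=n_samples)   # samples from p_k
    log_pk = norm.logpdf(x, means[k], stds[k])
    w_rest = np.delete(weights, k)
    w_rest = w_rest / w_rest.sum()
    p_rest = sum(w * norm.pdf(x, m, s) for w, m, s in
                 zip(w_rest, np.delete(means, k), np.delete(stds, k)))
    return float(np.mean(log_pk - np.log(p_rest + 1e-300)))

def prune_mixture(weights, means, stds, min_divergence=1.0):
    # Remove the most redundant (lowest-divergence) component while any
    # component stays within `min_divergence` nats of the rest of the
    # mixture, renormalising the weights after each removal.
    weights, means, stds = map(np.asarray, (weights, means, stds))
    while len(weights) > 1:
        div = np.array([component_divergence(weights, means, stds, k)
                        for k in range(len(weights))])
        if div.min() >= min_divergence:   # every component is distinct: stop
            break
        keep = np.arange(len(weights)) != div.argmin()
        weights, means, stds = weights[keep], means[keep], stds[keep]
        weights = weights / weights.sum()
    return weights, means, stds

# Example: two of the three components nearly coincide, so one is pruned.
w, m, s = prune_mixture([0.4, 0.35, 0.25], [0.0, 0.1, 5.0], [1.0, 1.0, 0.5])

The divergence threshold stands in for the paper's sign test: there, pruning stops once the system mutual information becomes non-positive; here, it stops once no component lies within the threshold of the remaining mixture.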
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, Z.R., Zwolinski, M. (2000). Applying Mutual Information to Adaptive Mixture Models. In: Leung, K.S., Chan, L.W., Meng, H. (eds) Intelligent Data Engineering and Automated Learning — IDEAL 2000. Data Mining, Financial Engineering, and Intelligent Agents. IDEAL 2000. Lecture Notes in Computer Science, vol 1983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44491-2_35
DOI: https://doi.org/10.1007/3-540-44491-2_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41450-6
Online ISBN: 978-3-540-44491-6
eBook Packages: Springer Book Archive