Abstract
This paper presents a method for determining an optimal set of components for a density mixture model using mutual information. A component with small mutual information is considered independent of the remaining components; it makes a significant contribution to the system and hence cannot be removed. Conversely, a component with large mutual information is unlikely to be independent of the remaining components and hence can be removed. Repeatedly removing components with positive mutual information until the system mutual information becomes non-positive finally yields a parsimonious structure for the density mixture model. The method has been verified on several examples.
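The abstract specifies the pruning loop but not the per-component mutual information estimator itself. The sketch below is a minimal illustration only, under assumptions not taken from the paper: a one-dimensional Gaussian mixture, and a Monte Carlo estimate of the KL divergence from each component to the renormalised mixture of the others, used as an inverse proxy for mutual dependence (strong overlap with the other components corresponds to low divergence and hence to a removable component). The names component_divergence and prune_mixture and the threshold value are hypothetical.

import numpy as np
from scipy.stats import norm

def component_divergence(weights, means, stds, k, n_samples=20000, rng=None):
    # Monte Carlo estimate of KL(p_k || p_rest): how distinct component k is
    # from the renormalised mixture of the remaining components. Low values
    # mean high overlap, i.e. strong dependence on the other components.
    rng = np.random.default_rng(0) if rng is None else rng
    x = rng.normal(means[k], stds[k], size=n_samples)   # samples from p_k
    log_pk = norm.logpdf(x, means[k], stds[k])
    w_rest = np.delete(weights, k)
    w_rest = w_rest / w_rest.sum()
    p_rest = sum(w * norm.pdf(x, m, s) for w, m, s in
                 zip(w_rest, np.delete(means, k), np.delete(stds, k)))
    return float(np.mean(log_pk - np.log(p_rest + 1e-300)))

def prune_mixture(weights, means, stds, min_divergence=1.0):
    # Remove the most redundant (lowest-divergence) component while any
    # component stays within `min_divergence` nats of the rest of the
    # mixture, renormalising the weights after each removal.
    weights, means, stds = map(np.asarray, (weights, means, stds))
    while len(weights) > 1:
        div = np.array([component_divergence(weights, means, stds, k)
                        for k in range(len(weights))])
        if div.min() >= min_divergence:   # every component is distinct: stop
            break
        keep = np.arange(len(weights)) != div.argmin()
        weights, means, stds = weights[keep], means[keep], stds[keep]
        weights = weights / weights.sum()
    return weights, means, stds

# Example: two of the three components nearly coincide, so one is pruned.
w, m, s = prune_mixture([0.4, 0.35, 0.25], [0.0, 0.1, 5.0], [1.0, 1.0, 0.5])

The divergence threshold stands in for the paper's sign test: there, pruning stops once the system mutual information becomes non-positive; here, it stops once no component lies within the threshold of the remaining mixture.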
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, Z.R., Zwolinski, M. (2000). Applying Mutual Information to Adaptive Mixture Models. In: Leung, K.S., Chan, L.W., Meng, H. (eds) Intelligent Data Engineering and Automated Learning — IDEAL 2000. Data Mining, Financial Engineering, and Intelligent Agents. IDEAL 2000. Lecture Notes in Computer Science, vol 1983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44491-2_35
DOI: https://doi.org/10.1007/3-540-44491-2_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41450-6
Online ISBN: 978-3-540-44491-6
eBook Packages: Springer Book Archive