Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1983)


Abstract

This paper presents a method for determining an optimal set of components for a density mixture model using mutual information. A component with small mutual information is considered independent of the remaining components; it makes a significant contribution to the system and therefore cannot be removed. A component with large mutual information, by contrast, is unlikely to be independent of the remaining components and can therefore be removed. Repeatedly removing components with positive mutual information until the system's mutual information becomes non-positive finally yields a parsimonious structure for the density mixture model. The method has been verified with several examples.
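The pruning procedure described above can be read as a simple loop: score every component against the rest of the mixture, remove the highest-scoring component while any score is positive, and stop once all scores are non-positive. The Python sketch below follows that loop for a Gaussian mixture. The paper's own mutual-information measure is given in the full text and is not reproduced here; component_vs_rest_mi, the threshold tau, and the resulting redundancy score (tau minus the mutual information between the data and the indicator "drawn from component k versus the rest") are illustrative assumptions, not the authors' formulation.

    import numpy as np
    from scipy.stats import multivariate_normal


    def component_vs_rest_mi(weights, means, covs, k, n_samples=5000, rng=None):
        """Monte Carlo estimate of I(B_k; X), where B_k indicates whether X was
        generated by component k or by the rest of the mixture.  Low values mean
        component k is hard to distinguish from the rest (i.e. it is redundant).
        This is an illustrative stand-in, not the measure defined in the paper."""
        rng = np.random.default_rng(rng)
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()
        idx = rng.choice(len(w), size=n_samples, p=w)           # latent components
        xs = np.array([rng.multivariate_normal(means[i], covs[i]) for i in idx])
        dens = np.array([multivariate_normal(means[j], covs[j]).pdf(xs)
                         for j in range(len(w))])               # shape (K, n_samples)
        p_x = w @ dens                                          # mixture density at xs
        r_k = w[k] * dens[k] / p_x                              # posterior P(B_k = 1 | x)

        def h(p):                                               # binary entropy in nats
            p = np.clip(p, 1e-12, 1.0 - 1e-12)
            return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

        return h(w[k]) - np.mean(h(r_k))                        # H(B_k) - E[H(B_k | X)]


    def prune_mixture(weights, means, covs, score_fn):
        """Remove the highest-scoring component while any score is positive,
        mirroring the stopping rule described in the abstract."""
        weights, means, covs = list(weights), list(means), list(covs)
        while len(weights) > 1:
            scores = [score_fn(weights, means, covs, k) for k in range(len(weights))]
            k = int(np.argmax(scores))
            if scores[k] <= 0:                                  # all scores non-positive: stop
                break
            del weights[k], means[k], covs[k]
            total = sum(weights)
            weights = [wi / total for wi in weights]            # renormalise the mixture
        return weights, means, covs


    # Toy example: two nearly identical components plus one well-separated one.
    # tau is a hypothetical threshold chosen for this illustration only.
    tau = 0.2
    score = lambda w, m, c, k: tau - component_vs_rest_mi(w, m, c, k, rng=0)
    w0 = [0.4, 0.4, 0.2]
    m0 = [np.zeros(2), np.zeros(2) + 0.05, np.array([5.0, 5.0])]
    c0 = [np.eye(2) for _ in range(3)]
    print(prune_mixture(w0, m0, c0, score))   # one of the two overlapping components is dropped

In this toy run the two overlapping components near the origin score positively (redundant with each other), so one is removed; the well-separated component scores negatively and is kept, after which the loop terminates.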

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, Z.R., Zwolinski, M. (2000). Applying Mutual Information to Adaptive Mixture Models. In: Leung, K.S., Chan, L.W., Meng, H. (eds) Intelligent Data Engineering and Automated Learning — IDEAL 2000. Data Mining, Financial Engineering, and Intelligent Agents. IDEAL 2000. Lecture Notes in Computer Science, vol 1983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44491-2_35

  • DOI: https://doi.org/10.1007/3-540-44491-2_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41450-6

  • Online ISBN: 978-3-540-44491-6

  • eBook Packages: Springer Book Archive
