Abstract
This paper proposes a clustering algorithm based on an information-theoretic criterion. The cross entropy between elements in different clusters is used to measure the quality of the partition. The algorithm first uses “classical” clustering algorithms to form small regions (auxiliary clusters), which are then merged to construct the final clusters. The algorithm was tested on several databases with different spatial distributions.
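The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything the abstract leaves unspecified is an assumption here: k-means stands in for the "classical" algorithm, the cross entropy is a Renyi-style quadratic estimate built from a Gaussian Parzen window, and regions are merged greedily, always joining the pair with the lowest cross entropy. The names `kmeans`, `cross_entropy`, `merge_regions` and the kernel width `sigma` are hypothetical.

```python
import math
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(p, q))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means, used here only to create small auxiliary regions."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep the old
        # center if a group happens to be empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return [g for g in groups if g]

def cross_entropy(a, b, sigma=1.0):
    """Assumed estimator: -log of the mean Gaussian kernel over all cross
    pairs (constant normalisation factor omitted). Low values mean the two
    regions overlap strongly."""
    total = sum(
        math.exp(-sq_dist(p, q) / (4.0 * sigma ** 2)) for p in a for q in b
    )
    return -math.log(total / (len(a) * len(b)) + 1e-300)

def merge_regions(regions, n_clusters, sigma=1.0):
    """Greedily merge the pair of regions with the lowest cross entropy
    until n_clusters remain (an assumed merging rule)."""
    clusters = [list(r) for r in regions]
    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(
            pairs,
            key=lambda ij: cross_entropy(clusters[ij[0]], clusters[ij[1]], sigma),
        )
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters

# Usage: two synthetic blobs, six auxiliary regions, two final clusters.
rng = random.Random(1)
data = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(40)]
data += [(rng.gauss(10, 0.5), rng.gauss(10, 0.5)) for _ in range(40)]
final_clusters = merge_regions(kmeans(data, k=6), n_clusters=2)
```

Because the merge step looks only at kernel overlap between regions, it can join auxiliary clusters along elongated or curved shapes that a single k-means run with the final number of clusters would split. The result depends on the choice of `sigma`, which this sketch leaves as a free parameter.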
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de Araújo, D., Neto, A.D., Melo, J., Martins, A. (2010). Clustering Using Elements of Information Theory. In: Diamantaras, K., Duch, W., Iliadis, L.S. (eds) Artificial Neural Networks – ICANN 2010. ICANN 2010. Lecture Notes in Computer Science, vol 6354. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15825-4_52
DOI: https://doi.org/10.1007/978-3-642-15825-4_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15824-7
Online ISBN: 978-3-642-15825-4
eBook Packages: Computer Science (R0)