Abstract
In this study we propose a discrete-time Hopfield Neural Network based algorithm for text clustering for the cases L = 2^q, where L is the number of clusters and q is a positive integer. The optimal general solution is not known even for the 2-cluster case. The main contributions of this paper are as follows. We show that (i) for the 2-cluster case, the sum of intra-cluster distances, which a text clustering algorithm seeks to minimize, is equal to the Lyapunov (energy) function of the Hopfield Network whose weight matrix is the Laplacian matrix obtained from the document-by-document distance matrix; and (ii) the Hopfield Network can be applied iteratively to text clustering for L = 2^q. Results of our experiments on several benchmark text datasets show the effectiveness of the proposed algorithm compared with k-means.
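The 2-cluster relationship stated in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes cluster labels s_i in {-1, +1}, a symmetric distance matrix with zero diagonal, the energy convention E(s) = -1/2 s^T W s, and an asynchronous sign update. Under these assumed conventions the intra-cluster distance sum equals the Hopfield energy up to an additive constant and a factor of 1/2. The function names, the toy data, and the choice to omit the Laplacian diagonal in the update (it only adds a constant to the energy because s_i^2 = 1) are illustrative assumptions.

```python
def laplacian(dist):
    """Graph Laplacian (degree matrix minus dist) of a zero-diagonal distance matrix."""
    n = len(dist)
    return [[(sum(dist[i]) if i == j else 0.0) - dist[i][j] for j in range(n)]
            for i in range(n)]

def energy(weights, state):
    """Hopfield energy E(s) = -1/2 * s^T W s (assumed convention)."""
    n = len(state)
    return -0.5 * sum(weights[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))

def intra_cluster_sum(dist, state):
    """Sum of distances over unordered pairs that share a cluster label."""
    n = len(state)
    return sum(dist[i][j] for i in range(n) for j in range(i + 1, n)
               if state[i] == state[j])

def hopfield_bisect(dist, state, max_sweeps=100):
    """Asynchronous sign updates; the constant diagonal term is left out (assumption)."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = -sum(dist[i][j] * state[j] for j in range(n) if j != i)
            new = state[i] if field == 0 else (1 if field > 0 else -1)
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:  # fixed point reached (possibly only a local minimum)
            break
    return state

# Toy data (hypothetical): six "documents" placed on a line in two groups.
xs = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
dist = [[abs(a - b) for b in xs] for a in xs]
total = sum(dist[i][j] for i in range(len(xs)) for j in range(i + 1, len(xs)))

final = hopfield_bisect(dist, [1, -1, 1, -1, 1, -1])
# Under the assumed conventions: intra(s) = total + E(s) / 2 with W = Laplacian.
identity_gap = intra_cluster_sum(dist, final) - (total + 0.5 * energy(laplacian(dist), final))
```

Each label flip in this update lowers the energy, so the loop stops at a fixed point, but that fixed point may be only a local minimum and depends on the initial labels. For L = 2^q, the same bisection could be reapplied to each resulting half, in the spirit of the iterative application the abstract describes.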
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Uykan, Z., Ganiz, M.C., Şahinli, Ç. (2012). Discrete-Time Hopfield Neural Network Based Text Clustering Algorithm. In: Huang, T., Zeng, Z., Li, C., Leung, C.S. (eds) Neural Information Processing. ICONIP 2012. Lecture Notes in Computer Science, vol 7663. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34475-6_66
DOI: https://doi.org/10.1007/978-3-642-34475-6_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34474-9
Online ISBN: 978-3-642-34475-6
eBook Packages: Computer Science, Computer Science (R0)