Abstract
Automatic clustering of documents has become increasingly important with the explosion of online information. The Self-Organising Map (SOM) has been used to cluster documents effectively, but efforts to date have used either a single 2-dimensional map or a series of them. Ideally, the output of a document-clustering algorithm should be easy for a user to interpret. This paper describes a method of clustering documents using a series of 1-dimensional SOMs arranged hierarchically to provide an intuitive tree structure representing document clusters. WordNet is used to find the base forms of words, and clustering is restricted to words that can be nouns.
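The idea of a hierarchy of 1-dimensional SOMs can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`train_1d_som`, `best_matching_unit`, `cluster_tree`), the learning-rate and neighbourhood schedules, the deterministic initialisation, and the stopping rule (`min_size`, `max_depth`) are all assumptions made for illustration. Documents are assumed to be already encoded as equal-length numeric vectors (for example, counts of noun base forms); the WordNet preprocessing step is omitted.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x."""
    def sqdist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda j: sqdist(weights[j]))


def train_1d_som(vectors, n_nodes, epochs=50, lr0=0.5, seed=0):
    """Train a chain of n_nodes (>= 2) nodes on equal-length vectors."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    last = len(vectors) - 1
    # Assumption: initialise nodes from inputs spread evenly through the list.
    weights = [list(vectors[(j * last) // max(n_nodes - 1, 1)])
               for j in range(n_nodes)]
    total_steps = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        order = list(range(len(vectors)))
        rng.shuffle(order)
        for i in order:
            x = vectors[i]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)  # linearly decaying learning rate
            radius = max(n_nodes / 2.0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)
            for j in range(n_nodes):
                # Gaussian neighbourhood measured along the 1-D chain.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for k in range(dim):
                    weights[j][k] += lr * h * (x[k] - weights[j][k])
            step += 1
    return weights


def cluster_tree(vectors, ids, n_nodes=2, min_size=2, depth=0, max_depth=3):
    """Recursively split documents with 1-D SOMs; returns a nested dict."""
    if len(ids) <= min_size or depth >= max_depth:
        return {"docs": ids, "children": []}
    weights = train_1d_som([vectors[i] for i in ids], n_nodes)
    groups = [[] for _ in range(n_nodes)]
    for i in ids:
        groups[best_matching_unit(weights, vectors[i])].append(i)
    non_empty = [g for g in groups if g]
    if len(non_empty) < 2:  # the map did not split this set any further
        return {"docs": ids, "children": []}
    children = [cluster_tree(vectors, g, n_nodes, min_size, depth + 1, max_depth)
                for g in non_empty]
    return {"docs": ids, "children": children}
```

Calling `cluster_tree(vectors, list(range(len(vectors))))` returns a nested dictionary in which each node lists its document indices and its child clusters, which maps directly onto a tree view. Each level uses its own small 1-D map, so the ordering of children along the chain reflects similarity between sibling clusters.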
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Russell, B., Yin, H., Allinson, N.M. (2002). Document Clustering Using the 1 + 1 Dimensional Self-Organising Map. In: Yin, H., Allinson, N., Freeman, R., Keane, J., Hubbard, S. (eds) Intelligent Data Engineering and Automated Learning — IDEAL 2002. IDEAL 2002. Lecture Notes in Computer Science, vol 2412. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45675-9_26
DOI: https://doi.org/10.1007/3-540-45675-9_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44025-3
Online ISBN: 978-3-540-45675-9
eBook Packages: Springer Book Archive