Uncovering the Hierarchical Structure of Text Archives by Using an Unsupervised Neural Network with Adaptive Architecture

Merkl, Dieter; Rauber, Andreas

doi:10.1007/3-540-45571-X_46


Conference paper
First Online: 01 January 2003


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1805)

Abstract

Discovering the inherent structure in data has become one of the major challenges in data mining applications. It requires stable and adaptive models that can handle the typically very high-dimensional feature spaces involved. In this paper we present the Growing Hierarchical Self-Organizing Map (GH-SOM), a neural network model based on the self-organizing map. The main feature of this extended model is that it can grow both in map size and in a three-dimensional tree structure in order to represent the hierarchical structure present in a data collection. This capability, combined with the stability of the self-organizing map for representing high-dimensional feature spaces, makes it an ideal tool for data analysis and exploration. We demonstrate the potential of this method with an application from the information retrieval domain, which is prototypical of the high-dimensional feature spaces frequently encountered in today’s applications.
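
The abstract does not spell out the training procedure, so the following is only a minimal sketch of a plain self-organizing map, the model the GH-SOM is said to be based on. It is not the authors' implementation. All function names, the linear decay schedules, and the Gaussian neighborhood are assumptions chosen for illustration. The growth and hierarchy rules are not given here, so the sketch stops at two quantities that a growing or hierarchical variant would plausibly inspect: the mean quantization error of a map and the set of inputs assigned to each unit.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def best_matching_unit(weights, x):
    """Grid position of the unit whose weight vector is closest to x."""
    return min(weights, key=lambda unit: sq_dist(weights[unit], x))


def train_som(data, rows, cols, epochs=20, lr0=0.5, radius0=None, seed=0):
    """Train a rows x cols map; returns {(row, col): weight_vector}.

    Assumed schedule: learning rate and neighborhood radius decay
    linearly over training (the radius is floored at 0.5).
    """
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = radius0 or max(rows, cols) / 2.0
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    steps = epochs * len(data)
    t = 0
    for _ in range(epochs):
        # Present the inputs in a fresh random order each epoch.
        for x in rng.sample(data, len(data)):
            frac = t / steps
            lr = lr0 * (1.0 - frac)
            radius = max(radius0 * (1.0 - frac), 0.5)
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                grid_d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius ** 2))
                # Move each unit toward x, scaled by grid proximity to the BMU.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            t += 1
    return weights


def mean_quantization_error(weights, data):
    """Average distance between each input and its best-matching unit."""
    total = sum(math.sqrt(sq_dist(weights[best_matching_unit(weights, x)], x))
                for x in data)
    return total / len(data)


def group_by_unit(weights, data):
    """Map each unit to the indices of the inputs it wins."""
    groups = {}
    for idx, x in enumerate(data):
        groups.setdefault(best_matching_unit(weights, x), []).append(idx)
    return groups


if __name__ == "__main__":
    # Toy two-dimensional inputs standing in for document vectors.
    docs = [[0.10, 0.10], [0.12, 0.08], [0.08, 0.12],
            [0.90, 0.90], [0.88, 0.92], [0.92, 0.88]]
    som = train_som(docs, rows=2, cols=2)
    print("mean quantization error:", mean_quantization_error(som, docs))
    print("inputs per unit:", group_by_unit(som, docs))
```

In a hierarchical setting, the groups returned by `group_by_unit` would be the natural input sets for training child maps, and the quantization error would be one candidate signal for deciding where to add units. Both uses are assumptions here rather than statements about the paper's method.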



Author information

Authors and Affiliations

Institut für Softwaretechnik, Technische Universität Wien, Favoritenstraße 9-11/188, A-1040, Wien, Austria
Dieter Merkl & Andreas Rauber


Editor information

Editors and Affiliations

Graduate School of Systems Management, University of Tsukuba, 3-29-1 Otsuka, Bunkyo-ku, Tokyo, 112-0012, Japan
Takao Terano
Department of Computer Science and Engineering, Arizona State University, P.O. Box 875 406, Tempe, AZ, 85287-5406
Huan Liu
Department of Computer Science, National Tsing Hua University, Hsinchu, 300, Taiwan ROC
Arbee L. P. Chen


Cite this paper

Merkl, D., Rauber, A. (2000). Uncovering the Hierarchical Structure of Text Archives by Using an Unsupervised Neural Network with Adaptive Architecture. In: Terano, T., Liu, H., Chen, A.L.P. (eds) Knowledge Discovery and Data Mining. Current Issues and New Applications. PAKDD 2000. Lecture Notes in Computer Science, vol 1805. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45571-X_46


DOI: https://doi.org/10.1007/3-540-45571-X_46
Published: 24 March 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67382-8
Online ISBN: 978-3-540-45571-4
eBook Packages: Springer Book Archive
